This article will be most useful to novice optimizers; advanced specialists should already know all of this. To use it with maximum efficiency, it helps to know exactly which words you want to raise to the right positions. If you are not yet sure of your keyword list, try a keyword suggestion service; it is a little confusing at first, but you can figure it out.
Important! Rest assured, Google understands perfectly well that ordinary users will not use these operators and that only promotion specialists resort to them. Therefore, Google may slightly distort the information it returns.
Intitle operator:
Usage: intitle:word
Example: intitle:website promotion
Description: This operator returns a list of pages whose title contains the word you are interested in; in our case, the whole phrase "website promotion". Note that there must be no space after the colon. The page title is important for ranking, so write your titles responsibly. Using this operator, you can estimate the approximate number of competitors who also want to occupy the leading positions for this word.
Inurl operator:
Usage: inurl:phrase
Example: inurl:calculating the cost of search engine optimization
Description: This command shows sites or pages that have the specified keyword in their URL. Note that there must be no space after the colon.
Inanchor operator:
Usage: inanchor:phrase
Example: inanchor:seo books
Description: This operator helps you see pages that are linked to with the keyword you specify. It is a very important command, but unfortunately search engines are reluctant to share this information with SEOs, for obvious reasons. There are services, Linkscape and Majestic SEO, that will provide this information for a fee, but rest assured, the information is worth it.
Also, it is worth remembering that Google now pays more and more attention to the "trust" of a site and less and less to its link mass. Of course, links are still one of the most important factors, but trust plays an increasingly important role.
Good results come from combining two operators, for example intitle:promotion inanchor:"site promotion". The search engine will then show our main competitors: pages whose title contains the word "promotion" and which have incoming links with the anchor "site promotion".
Unfortunately, this combination does not let you find out the "trust" of the domain, which, as we have already said, is a very important factor. For example, many old corporate sites do not have as many links as their younger competitors, but they do have many old links that pull them to the top of the SERPs.
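Combining operators like this is easy to script. Below is a minimal Python sketch (the helper names are hypothetical, not part of any tool mentioned here) that builds such a combined query and the corresponding Google search URL:

```python
from urllib.parse import quote_plus

def build_query(keyword, anchor):
    # Combine intitle: and inanchor: into one competitor-analysis query.
    # Note the absence of a space after each operator's colon.
    return f'intitle:{keyword} inanchor:"{anchor}"'

def search_url(query):
    # Standard Google search URL with the query percent-encoded.
    return "https://www.google.com/search?q=" + quote_plus(query)

query = build_query("promotion", "site promotion")
print(query)        # intitle:promotion inanchor:"site promotion"
print(search_url(query))
```

The f-string keeps the operator glued to its argument, mirroring the no-space-after-colon rule described above.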
Site operator:
Usage: site:site address
Example: site:www.aweb.com.ua
Description: With this command, you can see the list of pages that the search engine has indexed and knows about. It is mainly used to find and analyze competitors' pages.
Cache operator:
Usage: cache:page address
Example: cache:www.aweb.com.ua
Description: This command shows a "snapshot" of the page as of the robot's last visit to the site and, more generally, how it sees the content of the page. By checking the page's cache date, you can determine how often robots visit the site. The more authoritative the site, the more often robots visit it; accordingly, the less authoritative (in Google's view) the site, the less often they take snapshots of its pages.
The cache is very important when buying links. The closer the page's caching date is to the date the link was purchased, the faster your link will be indexed by Google. Sometimes you could find pages with a cache three months old. By buying a link on such a site, you will only waste your money, because it is quite possible that the link will never be indexed.
Link operator:
Usage: link:url
Example: link:www.aweb.com.ua
Description: The link: operator searches for and displays pages that link to the specified URL. This can be either the home page of the site or an internal one.
Related operator:
Usage: related:url
Example: related:www.aweb.com.ua
Description: The related: operator displays pages that the search engine considers similar to the specified one. To a person, the resulting pages may not look similar at all, but the search engine thinks they are.
Info operator:
Usage: info:url
Example: info:www.aweb.com.ua
Description: This operator returns the information about a page that the search engine knows: the author, the publication date, and more. Additionally, on the results page, Google offers several actions it can perform with this page; put simply, it suggests using some of the operators described above.
Allintitle operator:
Usage: allintitle:phrase
Example: allintitle:aweb promotion
Description: If we start a search query with this word, we get a list of pages that have the entire phrase in the title. For example, searching for allintitle:aweb promotion returns a list of pages whose titles mention both of these words. They do not have to follow each other; they can be located anywhere in the title.
Allintext operator:
Usage: allintext:word
Example: allintext:optimization
Description: This operator searches for all pages that contain the specified words in the body of their text. If we use allintext:aweb optimization, we will see a list of pages whose text contains both words: not the exact phrase "aweb optimization", but both "optimization" and "aweb" anywhere in the text.
Any search for vulnerabilities on web resources begins with intelligence and information gathering.
Reconnaissance can be active (brute-forcing the site's files and directories, running vulnerability scanners, browsing the site manually) or passive (searching for information in various search engines). Sometimes a vulnerability becomes known even before the first page of the site is opened.
How is this possible?
Search robots, wandering the Internet non-stop, record not only information useful to ordinary users but also things that cybercriminals can use in an attack on a web resource: script errors and files with sensitive information (from configuration files and logs to files with authentication data and database backups).
From a search robot's point of view, an error message about SQL query execution is plain text, inseparable from, say, the product descriptions on the page. If a search robot comes across a file with the .sql extension that for some reason ended up in the site's working folder, it is treated as part of the site's content and indexed as well (including, possibly, the passwords specified in it).
Such information can be found by knowing stable, often unique, keywords that help separate "vulnerable pages" from pages that do not contain vulnerabilities.
A huge database of special queries using keywords (so-called dorks) exists on exploit-db.com and is known as the Google Hack Database.
Why Google?
Dorks are aimed primarily at Google for two reasons:
- it has the most flexible syntax of keywords (shown in Table 1) and special characters (shown in Table 2);
- the Google index is still more complete than that of other search engines.
Table 1 - Main Google keywords

Keyword | Meaning | Example
site | Search only on the specified site; only the URL is taken into account | site:somesite.ru - finds all pages of this domain and its subdomains
inurl | Search by words present in the URI; unlike the keyword "site", it matches after the site name | inurl:news - finds all pages where the given word appears in the URI
intext | Search in the body of the page | intext:"plugs" - completely similar to the ordinary query "plugs"
intitle | Search in the page title, i.e. the text enclosed between the <title> tags | intitle:"index of" - finds all pages with a directory listing
ext | Search for pages with the specified extension | ext:pdf - finds all PDF files
filetype | Currently a complete analog of the keyword "ext" | filetype:pdf - same
related | Search for sites with similar topics | related:google.ru - shows its analogs
link | Search for sites that link to the given one | link:somesite.ru - finds all sites that link to it
define | Show the definition of a word | define:0day - definition of the term
cache | Show the cached content of the page (if available) | cache:google.com - opens the page from the cache
Table 2 - Special characters for Google queries

Symbol | Meaning | Example
" | Exact phrase | intitle:"RouterOS router configuration page" - search for routers
* | Any text | inurl:"bitrix*mcart" - search for sites on Bitrix with a vulnerable mcart module
. | Any character | Index.of - similar to the query "index of"
- | Exclude a word | error -warning - show all pages that have "error" but no "warning"
.. | Range | cve 2006..2016 - show vulnerabilities by year starting from 2006
| | Boolean "or" | linux | windows - show pages where either the first or the second word occurs
It should be understood that any request to a search engine is a search only by words.
It is useless to search a page for meta-characters (quotes, brackets, punctuation marks, etc.). Even a search for an exact phrase in quotes is a word search, followed by a search for an exact match within the results.
All Google Hack Database dorks are logically divided into 14 categories and are presented in Table 3.
Table 3 - Google Hack Database Categories

Category | What it allows you to find | Example
Footholds | Web shells, public file managers | Find hacked sites with the listed web shells uploaded: (intitle:"phpshell" OR intitle:"c99shell" OR intitle:"r57shell" OR intitle:"PHP Shell" OR intitle:"phpRemoteView") `rwx` "uname"
Files containing usernames | Registry files, configuration files, logs, command-history files | Find registry files containing account information: filetype:reg reg +intext:"internet account manager"
Sensitive Directories | Directories with various information (personal documents, VPN configs, hidden repositories, etc.) | Find directory listings containing VPN-related files: "Config" intitle:"Index of" intext:vpn ; sites containing Git repositories: (intext:"index of /.git") ("parent directory")
Web Server Detection | Version and other information about the web server | Find JBoss administrative consoles: inurl:"/web-console/" intitle:"Administration Console"
Vulnerable Files | Scripts containing known vulnerabilities | Find sites using a script that allows downloading an arbitrary file from the server: allinurl:forcedownload.php?file=
Vulnerable Servers | Installation scripts, web shells, open administrative consoles, etc. | Find open phpMyAdmin consoles running as root: intitle:phpMyAdmin "Welcome to phpMyAdmin ***" "running on * as root@*"
Error Messages | Errors and warnings that often reveal important information, from the CMS version to passwords | Sites with errors in executing SQL queries to the database: "Warning: mysql_query()" "invalid query"
Files containing juicy info | Certificates, backups, emails, logs, SQL scripts, etc. | Find SQL initialization scripts: filetype:sql "insert into" -site:github.com
Files containing passwords | Anything that can contain passwords: logs, SQL scripts, etc. | Logs mentioning passwords: filetype:log intext:password | pass | pw ; SQL scripts containing passwords: ext:sql intext:username intext:password
Sensitive Online Shopping Info | Information related to online shopping | Find PIN codes: dcid= bn= pincode=
Network or vulnerability data | Information not directly related to the web resource but affecting the network or other non-web services | Find automatic proxy configuration scripts that reveal the internal network: inurl:proxy | inurl:wpad ext:pac | ext:dat findproxyforurl
Pages containing login portals | Pages containing login forms | SAP Logon web pages: intext:"2016 SAP AG. All rights reserved." intitle:"Logon"
Various Online Devices | Printers, routers, monitoring systems, etc. | Find a printer configuration panel: intitle:"hp laserjet" inurl:SSI/Auth/set_config_deviceinfo.htm
Advisories and Vulnerabilities | Websites running vulnerable CMS versions | Find vulnerable plugins that allow uploading an arbitrary file to the server: inurl:fckeditor -intext:"ConfigIsEnabled = False" intext:ConfigIsEnabled
Dorks are more often aimed at searching across all sites on the Internet, but nothing prevents you from limiting the search to one or more sites.
Each Google query can be focused on a specific site by adding the keyword "site:somesite.com" to it. This keyword can be appended to any dork.
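Appending site: to every dork in a list is mechanical, so it is easy to script. A minimal Python sketch (the function name and sample dorks are illustrative):

```python
def scope_dorks(dorks, site):
    # Limit each dork to one site by appending the site: keyword.
    return [f"{dork} site:{site}" for dork in dorks]

dorks = ['intitle:"Index of" intext:vpn', 'filetype:sql "insert into"']
for q in scope_dorks(dorks, "somesite.com"):
    print(q)
```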
Vulnerability search automation
So the idea was born to write a simple utility that automates the search for vulnerabilities using a search engine (Google) and relies on the Google Hack Database.
The utility is a script written in Node.js using PhantomJS. To be precise, the script is interpreted by PhantomJS itself.
PhantomJS is a full-fledged web browser without a graphical interface, controlled by JS code and equipped with a convenient API.
The utility has received the quite self-explanatory name dorks. Running it on the command line (without options) prints short help with several usage examples:
Figure 1 - List of basic dorks options
The general syntax of the utility is: dorks "command" "list of options".
A detailed description of all options is presented in Table 4.
Table 4 - dorks syntax

Command | Option | Description
ghdb | -l | Display a numbered list of Google Hack Database dork categories
ghdb | -c "category number or name" | Load dorks of the specified category, by number or name
ghdb | -q "phrase" | Load dorks found by the given query
ghdb | -o "file" | Save the result to a file (only with the -c | -q options)
google | -d "dork" | Set an arbitrary dork (the option can be used many times and combined with the -D option)
google | -D "file" | Use dorks from a file
google | -s "site" | Set a site (the option can be used many times and combined with the -S option)
google | -S "file" | Use sites from a file (dorks will be brute-forced for each site independently)
google | -f "filter" | Set additional keywords (added to each dork)
google | -t "number of ms" | Interval between requests to Google
google | -T "number of ms" | Timeout when a captcha is encountered
google | -o "file" | Save the result to a file (only the dorks for which something was found are saved)
Using the ghdb command, you can get all dorks from exploit-db that match an arbitrary query, or load an entire category. Specifying category 0 unloads the whole database (about 4.5 thousand dorks).
The list of categories currently available is shown in Figure 2.
Figure 2 - List of available GHDB dork categories
The google command substitutes each dork into the Google search engine and analyzes the results for matches. The dorks for which something was found are saved to a file.
The utility supports different search modes:
1 dork and 1 site;
1 dork and many sites;
1 site and many dorks;
many sites and many dorks;
The list of dorks and sites can be specified either through an argument or through a file.
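The four modes listed above boil down to a cartesian product of dorks and sites. A rough Python sketch of that expansion (not the utility's actual Node.js code; names are illustrative):

```python
from itertools import product

def expand(dorks, sites):
    # Generate one query per (site, dork) pair.
    # With no sites given, each dork is searched Internet-wide,
    # which mirrors the modes listed above.
    if not sites:
        return list(dorks)
    return [f"{d} site:{s}" for s, d in product(sites, dorks)]

print(expand(["intext:password"], []))
print(expand(["intext:password", "ext:log"], ["a.com", "b.com"]))
```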
Demonstration of work
Let's try to search for vulnerabilities using the example of error messages. The command dorks ghdb -c 7 -o errors.dorks loads all known dorks of the "Error Messages" category, as shown in Figure 3.
Figure 3 - Loading all known dorks of the "Error Messages" category
The dorks are loaded and saved to a file. Now it remains to "unleash" them on some site (see Figure 4).
Figure 4 - Searching for vulnerabilities of the site of interest in the google cache
After some time, several pages containing errors are found on the site under study (see Figure 5).
Figure 5 - Found error messages
As a result, in the file result.txt we get a complete list of dorks that lead to the error.
Figure 6 shows the result of searching for site errors.
Figure 6 - Error search result
The cache for this dork displays a full backtrace, revealing the absolute paths of the scripts, the site's content management system, and the database type (see Figure 7).
Figure 7 - Disclosure of information about the site device
However, it should be borne in mind that not all dorks from the GHDB give a true result. Also, Google may not find an exact match and show a similar result instead.
In that case, it is wiser to use a personal dork list. For example, it is always worth looking for files with "unusual" extensions; examples are shown in Figure 8.
Figure 8 - List of file extensions not typical for a regular web resource
As a result, with the command dorks google -D extensions.txt -f bank, from the very first request Google starts returning sites with "unusual" file extensions (see Figure 9).
Figure 9 - Searching for "bad" file types on banking sites
It should be borne in mind that google does not accept requests longer than 32 words.
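A dork list can be pre-checked against this limit before anything is sent to Google. A small Python sketch (how exactly Google counts words is an assumption here; this simply splits on whitespace):

```python
MAX_WORDS = 32  # Google rejects queries longer than this

def within_limit(query, limit=MAX_WORDS):
    # Rough check: each whitespace-separated token counts as one word.
    return len(query.split()) <= limit

assert within_limit("error -warning")
long_query = " ".join(["word"] * 33)
assert not within_limit(long_query)
print("ok")
```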
With the command dorks google -d intext:"error | warning | notice | syntax" -f university
you can look for PHP interpreter errors on educational sites (see Figure 10).
Figure 10 - Search for PHP runtime errors
Sometimes it is not convenient to use one or two whole categories of dorks.
For example, if you know that the site runs on the WordPress engine, you need WordPress-specific dorks. In this case, it is convenient to search the Google Hack Database. The command dorks ghdb -q wordpress -o wordpress_dorks.txt downloads all WordPress dorks, as shown in Figure 11:
Figure 11 - Searching for Wordpress related dorks
Let's go back to the banks and, with the command dorks google -D wordpress_dorks.txt -f bank, try to find something interesting related to WordPress (see Figure 12).
Figure 12 - Searching for Wordpress Vulnerabilities
It is worth noting that the Google Hack Database search does not accept words shorter than 4 characters. For example, the site's CMS may be unknown while the language is known: PHP. In this case, you can filter out what you need manually, using a pipe and the system search utility: dorks ghdb -c all | findstr /i php > php_dorks.txt (see Figure 13):
Figure 13 - Searching all dorks where PHP is mentioned
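findstr is Windows-only; the same filtering can be done portably. A Python sketch equivalent to the findstr /i pipe above (sample dorks are illustrative):

```python
def filter_dorks(dorks, needle):
    # Case-insensitive substring filter, a portable stand-in for `findstr /i`.
    needle = needle.lower()
    return [d for d in dorks if needle in d.lower()]

all_dorks = ['inurl:index.php?id=', 'intitle:"Index of" backup', 'ext:PHP intext:mysql_connect']
php_dorks = filter_dorks(all_dorks, "php")
print(php_dorks)
```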
You should search for vulnerabilities or sensitive information via a search engine only if the site has a significant index. For example, if a site has 10-15 pages indexed, it is silly to search for anything this way. Checking the index size is easy: just enter "site:somesite.com" into the Google search box. An example of an under-indexed site is shown in Figure 14.
Figure 14 - Checking the size of the site index
Now for the unpleasant part. From time to time Google may request a captcha; there is nothing to be done about it, it has to be entered. For example, while I was going through the "Error Messages" category (90 dorks), the captcha appeared only once.
It is worth adding that PhantomJS also supports working through a proxy, both via HTTP and SOCKS. To enable proxy mode, uncomment the corresponding line in dorks.bat or dorks.sh.
The tool is available as source code.
Even the best content will be of little use if you do not tell your target audience that it exists. That is why seeding content is one of a marketer's key tasks.
True, this task is not easy at all. But if everything works out, the result can exceed even the boldest expectations: there are many examples of a single high-quality article generating thousands of visits to a site from a variety of sources.
Today I will tell you how to look for seeding platforms, by what criteria to select them, and I will share a number of interesting sources whose existence, I am sure, many of my readers did not even suspect.
Main seeding channels
Any sources that can attract the attention of your target audience and provide additional transitions to the site will do.
Pay attention to channels such as:
- popular thematic resources in your niche;
- blogging platforms and communities;
- forums;
- email newsletters;
- social networks;
- YouTube.
A good study of even one of these areas can provide you with a decent number of visitors per article. All together, they can help in creating a real flow of referral traffic to the site, which can be converted into sales, leads, or provide the resource owner with a significant increase in influence and the formation of an expert's image.
What is the principle for selecting sites?
There are only two criteria: relevance to the topic of the article being seeded, and the presence of the target audience in large numbers.
With social platforms everything is clear: their audiences usually number in the millions of users. But when it comes to sites and forums, pay attention to traffic, since it would be silly to expect many click-throughs from a resource visited by 1000 people a month.
In one of my previous articles, I already wrote how to find out a site's traffic. In the context of our goals, this will definitely come in handy.
Now let's take a closer look at examples of selecting platforms for seeding, for each of the listed channels.
1. Thematic sites
The essence of the work: we look for a site on a suitable topic with good traffic and post material on it with a link to our resource or to an article published on it. To make it look as natural as possible, you can, for example, compile a digest and simply include a link to yourself as one of the list items.
For example, let's write an article titled "The 5 Best Translator Apps for iPhone", include an application developed by our client, and add a link like "read the detailed review of the app here".
The easiest way to find sites for placement is the site directory of the Miralinks article exchange. The downside is that the number of sites there is limited, and the credibility of many is in doubt because their owners post paid articles in batches.
The second way is to analyze the top resources in your niche and your competitors for referral traffic. You can do this using Similarweb.com: just enter the address of the site you want and look at the "Referrals" tab:
And the third, the most difficult and time-consuming but also the most effective, method is analyzing search results for informational queries in your topic. This way, you can arrange to place your link in an article that is already at the top of the search results:
2. Blogging platforms and communities
Platforms such as LiveInternet.ru and LiveJournal.com are actively used by many people for personal blogs on a variety of topics. For a small fee (though not always), you can arrange for your article to be mentioned in a thematic post, which brings click-throughs from interested readers.
You can also find communities for a specific niche, for example HabraHabr.ru for IT products and services, or Babyblog.ru for promotion in the "Home and Family" topic. Various question-and-answer services also belong here.
3. Forums
Here it is best to use crowd-marketing methods: a link to a high-quality informational article can easily be left in a forum thread, where it will look completely organic in the context of the discussion. A link to a product in an online store or to a service page, however, would quickly be removed by the moderators, no doubt about it.
A more cunning move is to arrange for one of the forum's authoritative users to place the link you need; then there will definitely be nothing to find fault with. In some cases, paid placement is also justified: the administration of many forums allows you to create the topic you need for a separate fee.
To collect a database of sites, we analyze search results using a query of the form inurl:forum "keyword", for example, like this:
Additionally, you can set the filtering of the search results by time.
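Generating such queries for a whole keyword list is trivial to automate. A Python sketch following the inurl:forum "keyword" pattern from the text (the keywords are illustrative):

```python
def forum_queries(keywords):
    # Build one forum-search query per keyword, quoting the keyword
    # so it is matched as an exact phrase.
    return [f'inurl:forum "{kw}"' for kw in keywords]

for q in forum_queries(["rose care", "seo promotion"]):
    print(q)
```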
4. Email newsletter
5. Social networks
My favorite way of seeding content, and it works great for almost any topic. In general, it is hard to imagine modern content marketing without active work with social networks.
In this case, several methods can be used:
The result can be just great:
It is quite simple to find communities for posting on VKontakte: enter the desired keyword into the search, click "All results", and check the "Communities" item in the sidebar on the right. In the search parameters, you can also specify the community type and how the results are sorted:
Then we just select the ones we need and write our proposal to the administrators.
You can also use the official exchange: after logging into your VKontakte profile, click the "Advertising" item at the bottom of the left menu and switch to the "Advertising in communities" tab. After creating an advertising post, you will have access to a huge selection of platforms for placement:
With Facebook, things are much more complicated. There is no proper community search within the social network itself, so you have to look for communities via special ratings and top lists, for example a Facebook community rating with the ability to sort by country.
Pages are easier: almost every promoted site has an official Facebook page, and you can write directly to its owner with a proposal to post your publication or a repost.
6. YouTube
Of all the seeding methods listed in this article, this one is used least often: not because it is ineffective, but simply because many do not even know about it. A great way to beat the competition and skim the cream.
The essence is simple: we look for thematic channels or existing videos in the search and arrange to place the link we need in the description. In the case of a channel, if the budget allows, you can even arrange a video review dedicated to your product.
For example, suppose we have an interesting and useful article on rose care published on the blog of an online store that sells seeds and seedlings. We enter the corresponding query into the YouTube search:
If necessary, set a time filter. We look for a video with a large number of views, or one that is growing in popularity, and arrange with the channel owners to place a link to the article in the video description, with a recommendation:
To find out the contacts of the owner of the YouTube channel, click on its name:
Then, click on the "About channel" tab:
And there is already everything you need:
Let's summarize
The content-seeding technique described in this article can seem very complicated and costly in terms of time and money. But believe me, it is worth it, and the effect will be an order of magnitude better than spam produced by the hands of schoolchildren who, under the guise of "crowd marketing", try to sell cut-rate SEO links to their clients.
Regular work in this area helps to build a brand! And over time, it promotes organic growth of backlinks and mentions of the company, without additional effort on the part of a specialist.
But the main thing is that over time, you can break away from competitors to an almost unattainable level. At the very least, you will be separated by a barrier in the form of dozens of publications, thousands of clicks, and hundreds of backlinks. And even a competitor with a very large budget will not be able to overcome it quickly.
And do not forget that quality content must be at the heart of everything. If you try to distribute boring, low-quality, second-rate articles, no one will pay attention to them anyway, even if your promotion budget has a few extra zeros.
The Google search engine (www.google.com) provides many search options. All of these capabilities are an invaluable search tool for a first-time Internet user and, at the same time, an even more powerful weapon of intrusion and destruction in the hands of people with evil intentions, including not only hackers but also non-computer criminals and even terrorists.
Denis Batrankov
denisNOSPAMixi.ru
Attention: This article is not a guide to action. It was written for you, web server administrators, so that you lose the false feeling that you are safe, finally understand the insidiousness of this method of obtaining information, and take up the protection of your site.
Introduction
For example, I found 1670 pages in 0.14 seconds!
2. Let's enter another line, for example:
inurl:"auth_user_file.txt" - a little less, but this is already enough for free downloading and for password brute-forcing (using the same John The Ripper). Below I will give a few more examples.
So, you need to realize that the Google search engine has visited most of the sites on the Internet and cached the information they contain. This cached information lets you obtain details about a site and its content without connecting to it directly, just by digging into the data stored inside Google. Moreover, if the information on a site is no longer available, the information in the cache may still be preserved. All this method requires is knowing some Google keywords. The technique is called Google Hacking.
Information about Google Hacking first appeared on the Bugtraq mailing list three years ago, in 2001, when the topic was raised by a French student. Here is a link to that letter: http://www.cotse.com/mailing-lists/bugtraq/2001/Nov/0129.html. It provides the first examples of such queries:
1) Index of / admin
2) Index of / password
3) Index of / mail
4) Index of / + banques + filetype: xls (for france ...)
5) Index of / + passwd
6) Index of / password.txt
This topic made a splash in the English-speaking part of the Internet quite recently, after Johnny Long's article published on May 7, 2004. For a more complete study of Google Hacking, I advise you to visit this author's site at http://johnny.ihackstuff.com. In this article, I just want to bring you up to date.
Who can use it:
- Journalists, spies, and all those people who like to poke their nose into other people's affairs can use this to search for compromising evidence.
- Hackers looking for suitable targets for hacking.
How Google works.
To continue the conversation, let me recall some of the keywords used in Google queries.
Search using the + sign
Google excludes words it considers unimportant from the search: for example, English question words, prepositions, and articles such as are, of, where. In Russian, Google seems to consider all words important. If a word is excluded from the search, Google reports it. To make Google search for pages containing such a word, add a + sign directly before the word, with no space after the sign. For example:
ace +of base
Search using the - sign
If Google finds a large number of pages and you need to exclude pages on a specific topic, you can force Google to search only for pages that do not contain specific words. To do this, prefix each such word with a - sign, with no space before the word. For example:
fishing -vodka
Search using ~
You may want to find not only the specified word, but also its synonyms. To do this, precede the word with the ~ symbol.
Finding the exact phrase using double quotes
Google searches each page for all occurrences of the words you wrote in the query string and does not care about their relative position; the main thing is that all the specified words are on the page at the same time (this is the default behavior). To find an exact phrase, put it in quotes. For example:
"bookend"
To find pages containing at least one of the specified words, specify the logical OR operation explicitly. For example:
book safety OR protection
In addition, in the search bar you can use the * sign to stand for any word and the . sign to stand for any character.
Finding words using additional operators
There are search operators, which are specified in the search string in the format:
operator: search_term
Do not put spaces around the colon. If you insert a space after the colon, you will see an error message; if you insert one before it, Google will treat the operator as an ordinary search term.
There are several groups of additional search operators:
- languages - indicate in which language you want to see the results;
- date - limit results to the past three, six, or twelve months;
- occurrences - indicate where in the document to search for the string: everywhere, in the title, or in the URL;
- domains - search only the specified site or, on the contrary, exclude it from the search;
- safe search - block sites containing the specified type of information and remove them from the search results pages.
At the same time, some operators do not need an additional parameter: for example, the query "cache: www.google.com" can be issued as a full-fledged search string. Other keywords, on the contrary, require a search word, for example "site: www.google.com help". In light of our topic, let's look at the following operators:
Operator - Description (each of the following requires a search_term parameter):
- site: - search only on the site specified in search_term
- filetype: - search only in documents of the search_term type
- intitle: - find pages containing search_term in the title
- allintitle: - find pages containing all the words of search_term in the title
- inurl: - find pages containing the word search_term in their URL
- allinurl: - find pages containing all the words of search_term in their URL
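These operators combine freely with ordinary search terms. As an illustration, here is a small helper (hypothetical, not part of any Google tooling) that assembles a query string from operator and search_term pairs in the format described above:

```python
def dork(*terms, **operators):
    """Assemble a Google query string from plain terms and
    operator:search_term pairs (operators as listed above).
    No space is inserted after the colon, as the operators require."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    return " ".join(parts + list(terms))

# For example, a title search restricted to a single site:
print(dork("Server at", intitle="index.of", site="ibm.com"))
# intitle:index.of site:ibm.com Server at
```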
Operator site: restricts the search to the specified site only, and you can specify not only a domain name but also an IP address. For example, enter:
site: www.google.com help
Operator filetype: restricts the search to files of a specific type.
As of the article's release date, Google can search within 13 different file formats:
- Adobe Portable Document Format (pdf)
- Adobe PostScript (ps)
- Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)
- Lotus WordPro (lwp)
- MacWrite (mw)
- Microsoft Excel (xls)
- Microsoft PowerPoint (ppt)
- Microsoft Word (doc)
- Microsoft Works (wks, wps, wdb)
- Microsoft Write (wri)
- Rich Text Format (rtf)
- Shockwave Flash (swf)
- Text (ans, txt)
Operator link: shows all pages that point to the specified page.
It's probably always interesting to see how many places on the Internet know about you - try it with your own page.
Operator cache: shows the version of a page in Google's cache - the way it looked when Google last visited it. Take any frequently changing site and look at:
cache: www.google.com
Operator intitle: searches for the specified word in the page title. Operator allintitle: is an extension - it searches for all of the specified words in the page title. Compare:
intitle: flight to mars
intitle: flight intitle: to intitle: mars
allintitle: flight to mars
Operator inurl: makes Google show all pages containing the specified string in their URL. Operator allinurl: searches for all the words in the URL. For example:
allinurl: acid acid_stat_alerts.php
This command is especially useful for those who do not have SNORT - at least they can see how it works on a real system.
Hacking Methods Using Google
So, we have found out that, using a combination of the above operators and keywords, anyone can collect the necessary information and search for vulnerabilities. These techniques are often referred to as Google Hacking.
Site map
You can use the site: operator to see all the links that Google has found on a site. Pages dynamically created by scripts are usually not indexed because of the URL parameters, so some sites use ISAPI filters so that links look not like /article.asp?num=10&dst=5 but like /article/abc/num/10/dst/5. This is done so that the site gets indexed by search engines at all.
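To illustrate the rewriting just described, here is a minimal sketch of the mapping (real ISAPI filters do this on the server side; the exact path layout here is an assumption based on the example link):

```python
from urllib.parse import urlparse, parse_qsl

def path_style(url: str) -> str:
    """Rewrite a parameterized link such as /article.asp?num=10&dst=5
    into a slash-separated form that search engines index more readily."""
    parsed = urlparse(url)
    stem = parsed.path.rsplit(".", 1)[0]        # drop the .asp extension
    pairs = [f"{k}/{v}" for k, v in parse_qsl(parsed.query)]
    return "/".join([stem] + pairs)

print(path_style("/article.asp?num=10&dst=5"))  # /article/num/10/dst/5
```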
Let's try:
site: www.whitehouse.gov whitehouse
Google assumes that every page of the site contains the word whitehouse, and we use this to get a list of all the pages.
There is also a simplified version:
site: whitehouse.gov
And the best part is that the comrades at whitehouse.gov did not even know that we examined the structure of their site and even peeked into the cached pages that Google downloaded for itself. This can be used to study the structure of sites and view their content while remaining unnoticed, at least for the time being.
Viewing a list of files in directories
WEB servers can show lists of server directories instead of the usual HTML pages. This is usually done so that users can select and download specific files. In many cases, however, administrators do not intend to show the contents of a directory; this happens because of incorrect server configuration or the absence of an index page in the directory. As a result, a hacker has a chance to find something interesting in the directory and use it for his own purposes. To find all such pages, it is enough to notice that they all contain the words index of in their title. But since the words index of appear not only on such pages, we need to refine the query and take into account keywords on the page itself, so queries of the following form will suit us:
intitle: index.of parent directory
intitle: index.of name size
Since most directory listings are intentional, you may find it hard to locate mistakenly exposed listings on the first try. But at the very least, you can already use the listings to determine the version of the WEB server, as described below.
Getting the version of the WEB server.
Knowing the version of the WEB server is always useful before starting any hacker attack. Again, thanks to Google, you can get this information without connecting to the server. If you look closely at a directory listing, you can see that the name of the WEB server and its version are displayed there.
Apache/1.3.29 - ProXad Server at trf296.free.fr Port 80
An experienced administrator can change this information, but, as a rule, it is true. Thus, to get this information, it is enough to send a request:
intitle: index.of server.at
To get information for a specific server, we clarify the request:
intitle: index.of server.at site: ibm.com
Or, vice versa, we look for servers running a specific version:
intitle: index.of Apache/2.0.40 Server at
This technique can be used by a hacker to find a victim. If, for example, he has an exploit for a specific version of the WEB server, he can find such a server and try the exploit on it.
You can also get the server version by looking at the pages that are installed by default in a fresh WEB server installation. For example, to see the Apache 1.2.6 test page, just type:
intitle: Test.Page.for.Apache it.worked!
Moreover, some operating systems install and start a WEB server right during installation, and some users are not even aware of it. Naturally, if you see that someone has not deleted the default page, it is logical to assume that the computer has not been configured at all and is probably vulnerable to attack.
Try to find IIS 5.0 pages
allintitle: Welcome to Windows 2000 Internet Services
In the case of IIS, you can determine not only the server version but also the Windows version and Service Pack.
Another way to determine the WEB server version is to look for manuals (help pages) and examples that may be installed on a site by default. Hackers have found quite a few ways to use these components to gain privileged access to a site, which is why you need to remove them from a production site - not to mention that their presence reveals the server type and version. For example, let's find the Apache manual:
inurl: manual apache directives modules
Using Google as a CGI scanner.
A CGI scanner, or WEB scanner, is a utility for finding vulnerable scripts and programs on the victim's server. These utilities need to know what to look for; for this, they carry a whole list of vulnerable files, for example:
/cgi-bin/cgiemail/uargg.txt
/random_banner/index.cgi
/cgi-bin/mailview.cgi
/cgi-bin/maillist.cgi
/cgi-bin/userreg.cgi
/iissamples/ISSamples/SQLQHit.asp
/SiteServer/admin/findvserver.asp
/scripts/cphost.dll
/cgi-bin/finger.cgi
We can find each of these files using Google by adding the words index of or the inurl: operator to the file name in the search bar. For example, we can find sites with vulnerable scripts:
allinurl: /random_banner/index.cgi
Using additional knowledge, a hacker can exploit a script vulnerability and force the script to return any file stored on the server - for example, a password file.
How to protect yourself from Google hacking.
1. Do not post important data to the WEB server.
Even if you posted the data only temporarily, you may forget about it, or someone will manage to find and copy it before you erase it. Don't do that. There are many other ways to transfer data that protect it from theft.
2. Check your site.
Use the methods described above to research your own site. Check your site periodically with the new methods that appear on http://johnny.ihackstuff.com. Remember that if you want to automate your actions, you need special permission from Google. If you read http://www.google.com/terms_of_service.html carefully, you will see the phrase: "You may not send automated queries of any sort to Google's system without express permission in advance from Google."
3. You may not need Google to index your site or part of it.
Google allows you to remove links to your site, or to part of it, from its database, as well as to remove pages from the cache. In addition, you can prohibit image search on your site and prohibit showing short page fragments in search results. All options for removing a site are described at http://www.google.com/remove.html. To use them, you must confirm that you are really the owner of the site, or insert special meta tags into the pages.
4. Use robots.txt
It is known that search engines look into the robots.txt file located in the root of the site and do not index the parts marked with the word Disallow. You can take advantage of this to prevent part of the site from being indexed. For example, to prevent the entire site from being indexed, create a robots.txt file containing two lines:
User-agent: *
Disallow: /
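You can check what such a file blocks with Python's standard robots.txt parser; a quick sketch:

```python
from urllib.robotparser import RobotFileParser

# Parse the two-line robots.txt shown above and check what a
# well-behaved crawler would be allowed to fetch.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

print(rp.can_fetch("Googlebot", "http://example.com/any/page.html"))  # False
```

Remember that robots.txt only stops well-behaved crawlers; it does not protect the files themselves in any way.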
What else happens
So that life does not seem too sweet, I will say in conclusion that there are sites that track people who use the above methods to look for holes in scripts and WEB servers. An example of such a page is
Application.
A little treat for dessert. Try some of the following queries yourself:
1. #mysql dump filetype: sql - find dumps of mySQL databases
2. Host Vulnerability Summary Report - will show you what vulnerabilities other people have found
3. phpMyAdmin running on inurl: main.php - control via the phpMyAdmin panel
4. not for distribution confidential
5. Request Details Control Tree Server Variables
6. Running in Child mode
7. This report was generated by WebLog
8. intitle: index.of cgiirc.config
9. filetype: conf inurl: firewall -intitle: cvs - does anyone need firewall configuration files? :)
10. intitle: index.of finances.xls - hmm....
11. intitle: Index of dbconvert.exe chats - icq chat logs
12. intext: Tobias Oetiker traffic analysis
13. intitle: Usage Statistics for Generated by Webalizer
14. intitle: statistics of advanced web statistics
15. intitle: index.of ws_ftp.ini - ws ftp config
16. inurl: ipsec.secrets holds shared secrets - the secret key is a good find
17. inurl: main.php Welcome to phpMyAdmin
18. inurl: server-info Apache Server Information
19. site: edu admin grades
20. ORA-00921: unexpected end of SQL command - getting paths
21. intitle: index.of trillian.ini
22. intitle: Index of pwd.db
23. intitle: index.of people.lst
24. intitle: index.of master.passwd
25. inurl: passlist.txt
26. intitle: Index of .mysql_history
27. intitle: index of intext: globals.inc
28. intitle: index.of administrators.pwd
29. intitle: Index.of etc shadow
30. intitle: index.of secring.pgp
31. inurl: config.php dbuname dbpass
32. inurl: perform filetype: ini
Getting private data does not always mean hacking - sometimes it is simply published in the open. Knowing Google's settings and a bit of ingenuity will let you find a lot of interesting things - from credit card numbers to FBI documents.
WARNING
All information is provided for informational purposes only. Neither the editors nor the author are responsible for any possible harm caused by the material in this article.

Today, everything is connected to the Internet, with little concern for restricting access. Therefore, a lot of private data becomes the prey of search engines. Spider robots are no longer limited to web pages; they index all the content available on the Web and constantly add non-public information to their databases. Finding out these secrets is easy - you just need to know exactly how to ask about them.
Looking for files
In the right hands, Google will quickly find everything that is not properly secured on the Web - for example, personal information and files for official use. They are often hidden like a key under a doormat: there are no real access restrictions, the data simply lies in the back yard of the site, where no links lead. Google's standard web interface provides only basic advanced-search settings, but even those will suffice.
You can limit your search on Google to specific file types using two operators: filetype and ext. The first specifies the format that the search engine determined from the file header, the second the file extension, regardless of the file's internal content. In both cases, you only need to specify the extension. Initially, the ext operator was convenient in cases where the file had no specific format features (for example, for searching ini and cfg configuration files, which could contain anything). Now Google's algorithms have changed, and there is no visible difference between the operators - in most cases the results come out the same.
Filtering the issue
By default, Google searches for words (and, in general, any entered characters) in all files on indexed pages. You can limit the search scope to a top-level domain or a specific site, or by the location of the desired sequence within the files themselves. For the first two options, the site operator is used, followed by the name of the domain or the selected site. In the third case, a whole set of operators allows you to search for information in service fields and metadata: for example, allinurl finds the specified text in the body of the links themselves, allinanchor in link anchor text, allintitle in page titles, and allintext in the body of the pages.
Each operator has a light version with a shorter name (without the all prefix). The difference is that allinurl finds links containing all the words, while inurl requires only the first word to appear in the link; the second and subsequent words of the query can appear anywhere on the page. The inurl operator also differs from the similar-sounding site operator: the former lets you find any sequence of characters in the link to the searched document (for example, /cgi-bin/), which is widely used for finding components with known vulnerabilities.
Let's try it in practice. We take the allintext operator and make the request return a list of credit card numbers and verification codes that will expire only in two years (or when their owners get tired of feeding everyone in a row).
allintext: card number expiration date /2017 cvv
When you read in the news that a young hacker "hacked into the servers" of the Pentagon or NASA, stealing classified information, then in most cases we are talking about just such an elementary technique of using Google. Suppose we are interested in a list of NASA employees and their contact details. Surely there is such a list in electronic form. For convenience or by oversight, it can also be found on the organization's website itself. It is logical that in this case there will be no links to it, since it is intended for internal use. What words can be in such a file? At least - the "address" field. Testing all these assumptions is easy.
inurl: nasa.gov filetype: xlsx "address"
We use bureaucracy
Finds like these are a nice trifle, though. A really solid catch requires a more detailed knowledge of Google's operators for webmasters, of the Web itself, and of the structure of what you are looking for. Knowing the details, you can easily filter the results and refine the properties of the files you need so that the rest contains really valuable data. It's funny that bureaucracy comes to the rescue here: it produces standard formulations that make it convenient to search for secret information accidentally leaked onto the Web.
For example, the Distribution statement stamp, mandatory in US Department of Defense paperwork, denotes standardized restrictions on the distribution of a document. The letter A denotes public releases in which there is nothing secret; B - for internal use only; C - strictly confidential; and so on up to F. The letter X stands apart, marking especially valuable information that constitutes a state secret of the highest level. Let those whose duty it is search for such documents; we will restrict ourselves to files with the letter C. According to DoDI directive 5230.24, this marking is assigned to documents containing a description of critical technologies falling under export control. Such closely guarded information can be found on sites in the .mil top-level domain, dedicated to the US Army.
"DISTRIBUTION STATEMENT C" inurl: navy.mil
It is very convenient that the .mil domain contains only sites of the US Department of Defense and its contract organizations. Domain-restricted search results are exceptionally clean, and the titles speak for themselves. Searching for Russian secrets this way is practically useless: chaos reigns in the .ru and .rf domains, and the names of many weapons systems sound botanical (the PP "Cypress", the ACS "Akatsiya") or downright fabulous (the TOS "Buratino").
By carefully examining any document from a site in the .mil domain, you can spot other markers to refine your search - for example, the reference to the export restriction "Sec 2751", which is also convenient for digging up interesting technical information. From time to time such documents are removed from the official sites where they once appeared, so if you cannot follow an interesting link in the search results, use Google's cache (the cache operator) or the Internet Archive.
Climbing into the clouds
Besides accidentally declassified government documents, links to personal files from Dropbox and other storage services occasionally pop up in Google's cache - services that create "private" links to publicly posted data. It is even worse with alternative and home-grown services. For example, the following request finds the data of all Verizon customers who have an FTP server installed and actively used on their router.
allinurl: ftp:// verizon.net
There are now more than forty thousand such smart people, and in the spring of 2015 there were an order of magnitude more. Instead of verizon.net, you can substitute the name of any well-known provider, and the more famous it is, the bigger the catch can be. Through the built-in FTP server, you can see files on the external storage connected to the router. Usually this is a NAS for remote work, a personal cloud, or some kind of peer-to-peer file downloader. All contents of such media are indexed by Google and other search engines, so files stored on external drives can be reached via a direct link.
Peeping configs
Before the widespread migration to the clouds, simple FTP servers ruled as remote storage, and they had plenty of vulnerabilities too. Many of them are still relevant today. For example, the popular WS_FTP Professional program stores configuration data, user accounts, and passwords in the ws_ftp.ini file. It is easy to find and read, since all records are stored in plain text and passwords are encrypted with Triple DES after minimal obfuscation. In most versions, simply discarding the first byte is sufficient.
It is easy to decrypt such passwords using the WS_FTP Password Decryptor utility or a free web service.
When people talk about hacking an arbitrary site, they usually mean getting a password from the logs and backups of the configuration files of a CMS or e-commerce application. If you know their typical structure, you can easily specify the keywords. Lines like those found in ws_ftp.ini are extremely common. For example, Drupal and PrestaShop have a user identifier (UID) and a corresponding password (pwd), and all the information is stored in files with the .inc extension. You can search for them as follows:
"pwd =" "UID =" ext: inc
Revealing passwords from DBMS
In the configuration files of SQL servers, user names and email addresses are stored in clear text, while MD5 hashes are written instead of passwords. Strictly speaking, it is impossible to decrypt them, but you can find a match among known hash-password pairs.
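Matching a found hash against candidate passwords is trivial; a minimal sketch with Python's standard hashlib (the wordlist here is, of course, made up):

```python
import hashlib

def crack_md5(target_hex, wordlist):
    """Look for a candidate whose unsalted MD5 hash matches target_hex.
    MD5 is not reversed; a match is simply found among known pairs."""
    for candidate in wordlist:
        if hashlib.md5(candidate.encode()).hexdigest() == target_hex:
            return candidate
    return None

stolen_hash = hashlib.md5(b"letmein").hexdigest()
print(crack_md5(stolen_hash, ["123456", "qwerty", "letmein"]))  # letmein
```

Real attacks use precomputed tables or GPU crackers, but the principle is exactly this comparison.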
To this day, there are DBMSs that do not even use password hashing; the configuration files of any of them can simply be viewed in the browser.
intext: DB_PASSWORD filetype: env
With the advent of Windows servers, the place of configuration files was partly taken by the registry. You can search its branches in exactly the same way, using reg as the file type. For example, like this:
filetype: reg HKEY_CURRENT_USER "Password" =
Don't forget the obvious
Sometimes you can get to classified information through data that was accidentally left open and caught Google's eye. Ideally, you find a list of passwords in some common format. Only desperate people store account information in a text file, Word document, or Excel spreadsheet, but there are always enough of them.
filetype: xls inurl: password
On the one hand, there are many ways to prevent such incidents. You need to set adequate access rights in htaccess, patch the CMS, avoid dubious third-party scripts, and close other holes. There is also the robots.txt file, which prevents search engines from indexing the files and directories listed in it. On the other hand, if the robots.txt structure on some server differs from the standard one, it immediately becomes clear what they are trying to hide there.
The list of directories and files on any site is preceded by the standard words index of. Since for service purposes they must appear in the title, it makes sense to limit the search with the intitle operator. Interesting things can be found in the /admin/, /personal/, /etc/, and even /secret/ directories.
Follow the updates
Relevance is extremely important here: old vulnerabilities are closed very slowly, while Google and its search results change constantly. There is even a difference between the "last second" filter (&tbs=qdr:s at the end of the request URL) and the "real time" filter (&tbs=qdr:1).
The date and time interval of a file's last update is also indicated by Google, if implicitly. Through the graphical web interface, you can select one of the typical periods (hour, day, week, and so on) or set a date range, but this method is not suitable for automation.
From the look of the address bar, you can only guess at the way to limit the output of results: the &tbs=qdr: construction. The letter y after it sets a limit of one year (&tbs=qdr:y), m shows results for the last month, w for the week, d for the past day, h for the last hour, n for the last minute, and s for the last second. The most recent results, just reported to Google, are found using the &tbs=qdr:1 filter.
If you need to write a tricky script, it is useful to know that the date range is set in Google in Julian format using the daterange operator. For example, this is how you can find PDF documents with the word confidential uploaded between January 1 and July 1, 2015.
confidential filetype: pdf daterange: 2457024-2457205
The range is specified as Julian day numbers, without the fractional part. Translating them manually from the Gregorian calendar is inconvenient; it is easier to use a date converter.
Targeting and filtering again
In addition to specifying operators in the search query itself, they can be passed directly in the URL. For example, the qualifier filetype: pdf corresponds to the construction as_filetype=pdf. This makes it convenient to specify any refinements. Say, results only from the Republic of Honduras are returned by adding the construction cr=countryHN to the search URL, and only from the city of Bobruisk - gcs=Bobruisk. The complete list can be found in the developer section.
Google's automation tools are meant to make life easier, but they often add problems. For example, the user's city is determined from the user's IP via WHOIS. Based on this information, Google not only balances the load between servers but also changes the search results. Depending on the region, the same request will return different results on the first page, and some of them may be hidden entirely. To feel like a cosmopolitan and search for information from any country, use its two-letter code after the gl=country directive. For example, the code of the Netherlands is NL, while the Vatican and North Korea have no code of their own on Google.
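These URL parameters are easy to combine programmatically. A hedged sketch (the parameter names tbs, gl, and as_filetype are taken from the text above; Google may change or ignore them at any time):

```python
from urllib.parse import urlencode

def google_search_url(query: str, **params) -> str:
    """Build a Google search URL with extra result-filtering parameters
    such as tbs (time), gl (country), or as_filetype (file format)."""
    return "https://www.google.com/search?" + urlencode({"q": query, **params})

# PDF results from the last week, as seen from the Netherlands:
print(google_search_url("confidential", tbs="qdr:w", as_filetype="pdf", gl="nl"))
```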
Often, search results are cluttered even after applying several advanced filters. In this case, it is easy to refine the query by adding a few exclusion words (each preceded by a minus sign). For example, banking, names, and tutorial are often used together with the word Personal. Therefore, cleaner search results will be produced not by the textbook query but by the refined one:
intitle: "Index of /Personal/" -names -tutorial -banking
Last example
A sophisticated hacker is distinguished by providing himself with everything he needs on his own. For example, a VPN is convenient, but either expensive, or temporary and limited. A personal subscription is too costly. Fortunately, there are group subscriptions, and with Google's help it is easy to become part of a group. To do this, just find a Cisco VPN configuration file, which has the rather non-standard PCF extension and a recognizable path: Program Files\Cisco Systems\VPN Client\Profiles. One request, and you join, for example, the friendly staff of the University of Bonn.
filetype: pcf vpn OR Group
INFO
Google finds configuration files with passwords, but many of them are encrypted or replaced with hashes. If you see strings of fixed length, immediately look for a decryption service. Passwords in PCF files are stored encrypted, but Maurice Massard has already written a program to decrypt them and provides it free of charge through thecampusgeeks.com.
With Google's help, hundreds of different types of attacks and penetration tests can be performed. There are many variants, targeting popular programs, major database formats, numerous PHP vulnerabilities, clouds, and so on. Having a precise idea of what you are looking for greatly simplifies obtaining the information you need (especially information that was never meant to be made public). It is not only Shodan that feeds you interesting ideas - so does every database of indexed network resources!