Google Dorks For OSINT




There are entire books dedicated to Google searching and Google hacking. Most of these focus on penetration testing and securing computer networks. These are full of great information, but are often overkill for the investigator looking for quick personal information. A few simple rules can help locate more accurate data.


Search Operators



Most search engines allow the use of commands within the search field. These commands are not actually part of the search terms and are referred to as operators. Most operator searches have two parts, separated by a colon. To the left of the colon is the type of operator, such as "site" (website) or "ext" (file extension). To the right is the rule for the operator, such as the target domain or file type. This post will explain each operator and its most appropriate uses.
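The operator:value structure described above can be sketched in a few lines of Python. The helper name operator_term is hypothetical and exists only to illustrate the syntax. It assumes that a value containing spaces should be wrapped in double quotes so that the whole phrase stays attached to the operator.

```python
def operator_term(operator: str, value: str) -> str:
    """Join an operator and its rule with a colon, e.g. site:example.com."""
    value = value.strip()
    # Assumption: a multi-word value is quoted so the operator applies
    # to the whole phrase rather than only to the first word.
    if " " in value and not value.startswith('"'):
        value = f'"{value}"'
    # The colon sits directly between the operator and its value.
    return f"{operator}:{value}"


print(operator_term("site", "eforensicsmag.com"))
print(operator_term("ext", "pdf"))
print(operator_term("intitle", "business email compromise"))
```

The resulting strings can be pasted into the search field as they are, or joined with ordinary keywords.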


The Site Operator

The site operator asks Google to search within one website or domain. This operator provides two benefits to the search results. First, it will only provide results of pages located on a specific domain. Second, it will provide all of the results containing the search terms on that domain. If you want to view every page on a specific domain that includes your target of interest, the site operator is required.


To see how many pages Google has indexed for a site, enter the following query:


site:eforensicsmag.com  


But how many of these are blog posts? Let us find out


site:eforensicsmag.com/blog


Note: Google only gives a rough approximation when using this operator. For the full picture, check Google Search Console.


Next, I conducted the following exact search.


site:eforensicsmag.com "Joseph Moronwi"    


The result was all eleven pages on eforensicsmag.com that include my name within the content. This technique can be applied to any domain. This includes social networks, blogs, and any other website that is indexed by search engines. 
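If you run this kind of search often, the query can be turned into a shareable link. The sketch below assumes the commonly used https://www.google.com/search?q= URL form. The function name site_search_url is hypothetical, and the code only builds the link; it does not contact Google.

```python
from urllib.parse import quote_plus


def site_search_url(domain: str, phrase: str = "") -> str:
    """Build a search link that limits results to one domain."""
    query = f"site:{domain}"
    if phrase:
        # Double quotes request an exact-phrase match.
        query += f' "{phrase}"'
    # quote_plus percent-encodes the colon and quotes and turns spaces into "+".
    return "https://www.google.com/search?q=" + quote_plus(query)


print(site_search_url("eforensicsmag.com", "Joseph Moronwi"))
```

Swapping the domain for a social network or blog gives the same restricted search on that site.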


To view the subdomains of the target website, enter the following query


site:*.google.com -www


To find insecure (non-HTTPS) pages of a target domain, enter the following query


site:google.com -inurl:https


Using the same operator, you can restrict your search within one domain type. An example usage is given below.


computer forensics site:gov


This searches for the term computer forensics in all websites with the .gov domain. Jung Kim showed a nice dork to find people within GitHub:


site:github.com/orgs/*/people


Simply replace the asterisk (*) with the organisation's name to target people from a specific organisation.


The Filetype Operator

Another operator that works with both Google and Bing is the filetype filter. It allows you to filter any search results by a single file type extension. While Google allows this operator to be shortened to "ext", Bing does not. When you use the filetype operator with your search terms, Google will restrict the results to files that end with this extension.


Consider the following search attempting to locate PDF files associated with the terror group ISIS.


"ISIS" filetype:pdf


There are many uses for this technique. A search of filetype:doc "resume" "target name" often provides resumes created by the target which can include cellular telephone numbers, personal addresses, work history, education information, references, and other personal information that would never be intentionally posted to the internet. The "filetype" operator can identify any file by the file type within any website. This can be combined with the "site" operator to find all files of any type on a single domain. By conducting the following searches, I was able to find several documents stored on the website cnn.com


site:cnn.com filetype:pdf
site:cnn.com filetype:pptx
site:cnn.com filetype:doc
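Typing one query per extension gets tedious, so the three searches above can be generated from a list. This is a minimal sketch; the function name filetype_queries is hypothetical, and the extensions passed in are only examples.

```python
from typing import Iterable, List


def filetype_queries(domain: str, extensions: Iterable[str]) -> List[str]:
    """Return one site + filetype query per extension."""
    queries = []
    for ext in extensions:
        # Normalise input such as ".PDF" or "Pdf" to "pdf".
        clean = ext.strip().lstrip(".").lower()
        queries.append(f"site:{domain} filetype:{clean}")
    return queries


for query in filetype_queries("cnn.com", ["pdf", "pptx", "doc"]):
    print(query)
```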


Previously, Google and Bing indexed media files by type, such as MP3, MP4, AVI, and others. Due to abuse of pirated content, this no longer works well. The following extensions have been found to be indexed and provide valuable results. 


7Z: Compressed File 
BMP: Bitmap Image 
DOC: Microsoft Word 
DOCX: Microsoft Word 
DWF: Autodesk 
GIF: Animated Image 
HTM: Web Page 
HTML: Web Page 
JPG: Image 
JPEG: Image 
KML: Google Earth 
KMZ: Google Earth 
ODP: OpenOffice Presentation 
ODS: OpenOffice Spreadsheet 
ODT: OpenOffice Text 
PDF: Adobe Acrobat 
PNG: Image 
PPT: Microsoft PowerPoint 
PPTX: Microsoft PowerPoint 
RAR: Compressed File 
RTF: Rich Text Format 
TXT: Text File 
XLS: Microsoft Excel 
XLSX: Microsoft Excel 
ZIP: Compressed File  


The Exclusion operator (-)

You may want to exclude some content from appearing within results. The hyphen (-) tells most search engines and social networks to exclude the text immediately following from any results. It is important to never include a space between the hyphen and filtered text. The following query shows all links to my blog excluding the internal links.


site:* digitalinvestigator.blogspot.com -site:digitalinvestigator.blogspot.com


My goal with search filters is to whittle the total results down to a manageable amount. When you are overwhelmed with search results, add exclusions one at a time until the amount of data left to analyze is workable.
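Adding exclusions gradually can be sketched as follows. The function name with_exclusions and the example terms are hypothetical. The code assumes that a multi-word exclusion should be quoted, and it places the hyphen directly against the excluded text, with no space.

```python
from typing import Iterable


def with_exclusions(base_query: str, exclusions: Iterable[str]) -> str:
    """Append -term filters to a base query."""
    parts = [base_query.strip()]
    for item in exclusions:
        item = item.strip()
        if not item:
            continue
        # Quote multi-word exclusions so the hyphen applies to the phrase.
        if " " in item and not item.startswith('"'):
            item = f'"{item}"'
        # No space between the hyphen and the filtered text.
        parts.append("-" + item)
    return " ".join(parts)


print(with_exclusions("OSINT training", ["site:pinterest.com"]))
print(with_exclusions("OSINT training", ["site:pinterest.com", "jobs", "course catalog"]))
```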


The InURL Operator

The operators discussed so far apply to the content within the web page. The inurl operator instead searches for a specific word or phrase inside the URL of a web page. Because URLs often contain keywords that describe the page, choosing suitable keywords can cut out a lot of irrelevant results. My favourite search using this technique is to find File Transfer Protocol (FTP) servers that allow anonymous connections.


The following search would identify any FTP servers that possess PDF files that contain the term terror within the file


inurl:ftp -inurl:(http|https) filetype:pdf "terror"


Obviously, this operator could also be used to locate standard web pages, documents, and files. 


In an investigation, you might want to check if your target left a resume online. Simply enter the query below


inurl:curriculum vitae "Julian Assange"


You can prefix this operator with "all" (allinurl) to require every listed word to appear in the URL, in any order. For example, enter 


allinurl: OSINT intelligence


and Google will return pages with both terms, OSINT and intelligence, in their URLs.


The InTitle Operator

This operator will filter web pages by details other than the actual content of the page. This filter will only present web pages that have specific content within the title of the page. Practically every web page on the internet has an official title for the page. This is often included within the source code of the page and may not appear anywhere within the content. Most webmasters carefully create a title that will be best indexed by search engines.


If you conduct a search for "business email compromise" on Google, you will receive 552,000 results. However, the following search will filter those to 6,150. These only include web pages that had the search terms within the limited space of a page title.


intitle:"business email compromise"


You can prefix this operator with "all" (allintitle) to require every listed word to appear in the title, in any order. The following would find any pages that have the words business, email, and compromise within the title, regardless of the order:


allintitle:business email compromise
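The difference between a quoted intitle phrase and an allintitle word list can be illustrated locally. The sketch below is only an approximation for building intuition: the function names and sample titles are hypothetical, and Google's real matching is more sophisticated than a plain substring or word test.

```python
import re
from typing import Iterable


def title_has_phrase(title: str, phrase: str) -> bool:
    """Approximate intitle:"phrase": words must appear together, in order."""
    return phrase.lower() in title.lower()


def title_has_all_words(title: str, words: Iterable[str]) -> bool:
    """Approximate allintitle: every word must appear, in any order."""
    title_words = set(re.findall(r"\w+", title.lower()))
    return all(word.lower() in title_words for word in words)


words = ["business", "email", "compromise"]
for title in [
    "What Is Business Email Compromise?",
    "Email Compromise Hits Small Business",
    "Business Email Tips",
]:
    print(title,
          title_has_phrase(title, "business email compromise"),
          title_has_all_words(title, words))
```

Only the first sample title satisfies the phrase test, while the first two satisfy the any-order test.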


An interesting way to use this search technique is while searching for online folders. We often focus on finding websites or files of interest, but we tend to ignore the presence of online folders full of content related to our search. As an example, I conducted the following search on Google. 


intitle:index.of OSINT


The results contain online folders that usually do not have typical website files within them. Each possesses dozens of documents and other files related to our search term of OSINT. Some provide a folder structure that allows access to an entire web server of content. Notice that none of these results point to a specific page; all of them open a folder view of the data present.




The intext operator

The query intext:term restricts results to documents that contain term in the body text. This lets you confirm that a term appears in the content of a page before you open it. Normally you would open the page and press CTRL + F to find the term; with intext, every result returned already contains it.

For example, suppose we are researching the web series Tom Clancy's Jack Ryan and want to gather more information about the series, its characters, and so on. The appropriate query is given below.

intext:"jack ryan"

This will display all the results that have Jack Ryan in the content of the web page. In an investigation, you might want to check if your target left a resume online. Simply enter the query below


intext:curriculum vitae "Julian Assange"


You can prefix this operator with "all" (allintext) to require every listed word to appear in the text, in any order. For example, enter 


allintext:TOR Dark markets


and Google will only return pages that have all three terms (TOR, Dark, and markets) within their text.


The OR operator

You may have search terms that are not definitive. You may have a target that has a unique last name that is often misspelled. The OR operator, which must be written in capital letters and can also be written as a vertical bar (|), returns pages that have just A, just B, or both A and B. For example, entering 


DFIR OR OSINT

or entering

DFIR|OSINT

will retrieve pages that contain either the term DFIR or the term OSINT.
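When a name has several spellings, an OR group can be built from a list of variants. This is a minimal sketch: the function name or_group is hypothetical, and wrapping the group in parentheses is an assumption made so the group can be combined safely with other terms.

```python
from typing import Iterable


def or_group(terms: Iterable[str]) -> str:
    """Join alternatives with the capitalised OR operator."""
    cleaned = []
    for term in terms:
        term = term.strip()
        # Quote multi-word alternatives so each is treated as one unit.
        if " " in term and not term.startswith('"'):
            term = f'"{term}"'
        cleaned.append(term)
    # OR must be upper case; lower-case "or" is read as an ordinary word.
    return "(" + " OR ".join(cleaned) + ")"


print(or_group(["DFIR", "OSINT"]))
print(or_group(["DFIR", "digital forensics"]) + " training")
```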


The Wildcard operator

The asterisk (*) represents one or more words to Google and is considered a wild card. Google treats the * as a placeholder for a word or words within a search string. For example, "DFIR * training" tells Google to find pages containing a phrase that starts with "DFIR" followed by one or more words, followed by "training".  


Let’s say that you are looking into a person-of-interest and the only information you have about this individual is a username: JoseffMoro. While a search for JoseffMoro might return other places that the username shows up online, by using the Wildcard Operator and searching for 


JoseffMoro*com


we can instead see if any email addresses or other personal details appear publicly online that use the username as the unique identifier.



While this will not always return significantly different results to searching the username itself, it can be used as a quick way to identify an email address that can later be tied to other accounts.
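Once the wildcard search has returned results, the snippets still have to be scanned for addresses. The sketch below assumes you have copied the result text into a string. The function name and the sample addresses (on reserved example domains) are hypothetical and are not real findings.

```python
import re
from typing import List


def find_emails_for_username(username: str, text: str) -> List[str]:
    """Return unique email addresses in text whose local part is username."""
    pattern = re.compile(
        # \b keeps us from matching the tail of a longer username.
        r"\b" + re.escape(username) + r"@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
        re.IGNORECASE,
    )
    # Lower-case and de-duplicate so the same address is listed once.
    return sorted({match.group(0).lower() for match in pattern.finditer(text)})


sample = (
    "Contact JoseffMoro@example.com for details. "
    "Mirror account: joseffmoro@mail.example.org "
    "Unrelated: someoneelse@example.com"
)
print(find_emails_for_username("JoseffMoro", sample))
```

Any address found this way is a lead to verify, not proof that it belongs to the target.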


The Range Operator

The "Range Operator" tells Google to search between two identifiers. These could be sequential numbers or years. As an example,


OSINT Training 2015..2018


would result in pages that include the terms OSINT and training, and also include any number between 2015 and 2018. I have used this to filter results for online news articles that include a commenting system where readers can express their views. The following search identifies websites that contain information about Deborah Samuel, a Nigerian female Christian student lynched to death by her male Muslim colleagues on accusation of blasphemy against Islam in May 2022, and between 1 and 999 comments within the page. 


"Deborah Samuel" "1..999 comment"

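The range syntax is easy to get wrong, because the two dots must sit directly between the numbers with no spaces. A small helper can enforce that. The name number_range is hypothetical, and the check that the lower bound comes first is an assumption about sensible input rather than a documented Google rule.

```python
def number_range(low: int, high: int) -> str:
    """Format a numeric range as low..high with no surrounding spaces."""
    if low > high:
        raise ValueError("low must not be greater than high")
    return f"{low}..{high}"


print("OSINT Training " + number_range(2015, 2018))
print('"Deborah Samuel" "' + number_range(1, 999) + ' comment"')
```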

The Related Operator

To search for similar web pages, use the related operator. It takes a domain and attempts to provide online content related to that address. As an example, I conducted a search on Google with the following syntax.


related:google.com


The results included no references to that domain but did associate it with other search engines.


The Cache operator

The cache operator returns the most recently cached version of a web page, provided the page has been indexed. Investigators can use the cache operator to view earlier versions of edited or deleted web pages and recover removed intelligence.


cache:eforensicsmag.com


The Map operator

The map operator enables users to force Google to show map results for a locational search. The results show only location-specific data and do not include recent news stories. Investigators can use the map operator to focus on geospatial relevant intelligence.


map:Johannesburg


It is important to note that these operators can be combined in whatever ways the OSINT investigator sees fit. Some example uses follow.


To find guest blogging opportunities, you can combine operators as follows


digital forensics intitle:"write for us" inurl:"write-for-us"


This uncovers so-called “write for us” pages in the digital forensics niche


If you know of a serial guest blogger in your niche, try this:


Kronos Banking Trojan intext:"Marcus Hutchins" inurl:"author" -site:evilsite.com


Got someone in mind that you want to reach out to on social media? Try this trick to find their contact details:


Brett Shavers dfir training (site:twitter.com | site:facebook.com | site:linkedin.com)


To find Q+A threads related to your target term, enter the following query


Satoshi Nakamoto site:quora.com intitle:(TOR | "Bitcoin" | "cryptocurrency market")


By focusing on the LinkedIn site, you can look for people with a certain job title and a certain location. There is a trick that can prove useful, which is that you can search for icons or Unicode characters:


site:linkedin.com/in "<job title>" (☎ OR ☏ OR ✆ OR 📱) +"<location>"

It is also possible to search for a specific target name


"<name>" (☎ OR ☏ OR ✆ OR 📱)

You can search for copies of databases via Google too. To find some of them, simply search for:


ext:sql intext:"-- phpMyAdmin SQL Dump"


To search for Excel files in a target organisation that have the word contact in their URL, you can enter the following query (replace google.com with your target domain).


filetype:xls site:google.com inurl:contact


This yields Excel files that may contain contact lists from the target organisation.
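Combined queries like the one above can be assembled from named parts, which makes it harder to drop an operator by accident. This is a sketch only. The Dork class and its field names are hypothetical, it covers just four of the operators discussed in this post, and the fixed output order is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dork:
    """A small container for a combined search query."""
    keywords: str = ""
    filetype: Optional[str] = None
    site: Optional[str] = None
    inurl: Optional[str] = None
    intitle: Optional[str] = None

    def build(self) -> str:
        parts = [self.keywords.strip()]
        for name in ("filetype", "site", "inurl", "intitle"):
            value = getattr(self, name)
            if not value:
                continue
            # Quote multi-word values so they stay attached to the operator.
            if " " in value:
                value = f'"{value}"'
            parts.append(f"{name}:{value}")
        # Skip empty pieces, such as a missing keyword string.
        return " ".join(part for part in parts if part)


print(Dork(filetype="xls", site="google.com", inurl="contact").build())
print(Dork(keywords="digital forensics", intitle="write for us", inurl="write-for-us").build())
```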


The power of Google Dorks or search operators in investigations is in combining them. The reader is encouraged to explore various possible combinations of search operators to achieve the desired results.

