ONYPHE Query Language (OQL)

OQL can be used with the following APIs:


OQL lets you search for data using filters and boolean operators. A number of integrations exist in various languages if you want to avoid developing your own integration with our APIs. See the integrations chapter.

You can use it either from the CLI tools or from the Web interface, which leverages the Search API under the hood.

General OQL syntax

The syntax is the following:

category:<CATEGORY> filter1:<VALUE1> filter2:<VALUE2> -<FUNCTION1>:<FUNCTION_VALUE1> -<FUNCTION2>:<FUNCTION_VALUE2>


Examples:


category:datascan domain:google.com protocol:rdp -monthago:3


category:datascan device.class:"vpn server"

NOTE: field values are NOT case sensitive, while field names ARE case sensitive; field names are always available in lowercase.

NOTE2: values containing space characters must be enclosed in double quotes; other values need no quotes. Examples: device.class:"vpn server", device.class:database.
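As an illustration, the syntax and quoting rule above can be applied mechanically when a script assembles queries. The Python sketch below is one possible way to do so; oql_term and build_query are hypothetical helper names (not part of any ONYPHE tool), and the sketch assumes values never contain a double-quote character.

```python
from typing import Dict, Optional


def oql_term(field: str, value: str) -> str:
    """Return field:value, quoting the value if it contains whitespace."""
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    # Field names are case sensitive but always available as lowercase.
    return f"{field.lower()}:{value}"


def build_query(category: str,
                filters: Dict[str, str],
                functions: Optional[Dict[str, str]] = None) -> str:
    """Assemble the category, the filters and the -functions into one query."""
    parts = [oql_term("category", category)]
    parts += [oql_term(name, value) for name, value in filters.items()]
    parts += [f"-{name}:{value}" for name, value in (functions or {}).items()]
    return " ".join(parts)


print(build_query("datascan", {"device.class": "vpn server"}))
print(build_query("datascan",
                  {"domain": "google.com", "protocol": "rdp"},
                  {"monthago": "3"}))
```

The two calls reproduce the example queries shown earlier in this section.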

Supported boolean operators

OQL supports the following boolean operators:


NOTE: the OR boolean operator is available starting from Lion Views.

Examples:


category:datascan protocol:rdp domain:google.com


category:datascan domain:google.com !organization:google


category:datascan ?protocol:rdp ?protocol:ssh domain:google.com

By default, all fields are searchable with exact values only. That means you have to correctly enter the value for a filter. For instance, to search against protocol:rdp, you have to give the exact rdp string.

For specific fields, you can search for words in a full-text index. The following fields are full-text enabled in the datascan/riskscan/vulnscan data model:


Fields with full-text search enabled in Ctiscan are indicated with a .text suffix. For example: app.data.text


Therefore, only the aforementioned fields can be used to perform full-text searches; all the others only accept exact values.

Examples:


category:datascan app.http.title:confluence


category:datascan app.http.component.productvendor:"Atlassian" app.http.component.product:"Confluence"

Listing all available filters

You can navigate through the Web interface to find the fields that you need to refine your search, either from the displayed tabs or from the JSON tab. In fact, all fields displayed in the JSON output can be used as filters, except reserved fields prefixed with an @, such as @timestamp.

IP vs CIDR or network searches

When you need to find assets on a specific network block, you can use CIDR notation. However, to avoid performing I/O-intensive searches, you cannot specify networks larger than /16. You may use the splitsubnet CLI procedure to auto-split CIDR searches into smaller subnets.


category:datascan ip:8.8.8.8


category:datascan ip:8.8.8.0/24


category:datascan asn:AS15169

Not all fields support CIDR searches. The following fields are capable of that:


NOTE: the subnet field is NOT capable of CIDR searches; you have to pivot from this field's value and use it against the ip field.
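To illustrate the /16 limit, the Python sketch below splits a larger IPv4 block into /16 subnets and emits one query per subnet. It is not the splitsubnet CLI procedure itself, only an approximation of the idea using the standard ipaddress module; split_for_oql is a hypothetical name, and IPv4 input is assumed.

```python
import ipaddress
from typing import List

MAX_PREFIX = 16  # networks larger than /16 cannot be searched directly


def split_for_oql(cidr: str, category: str = "datascan") -> List[str]:
    """Return one ip:<CIDR> query per subnet no larger than /16."""
    net = ipaddress.IPv4Network(cidr, strict=False)
    if net.prefixlen >= MAX_PREFIX:
        subnets = [net]  # already small enough, keep as-is
    else:
        subnets = list(net.subnets(new_prefix=MAX_PREFIX))
    return [f"category:{category} ip:{subnet}" for subnet in subnets]


for query in split_for_oql("8.8.0.0/14"):
    print(query)
```

A /14 yields four /16 queries, while a /24 is returned unchanged as a single query.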

How hostnames are split

Our approach for building an Attack Surface Discovery & Attack Surface Management inventory is domain-based. To achieve that goal, we split hostnames (also known as Fully Qualified Domain Names, or FQDNs, and sometimes called subdomains) into different components. Thus, we split an FQDN into the following fields:

In the end, if you are not sure which field to query for a specific domain-based value, you can always perform an OR query:


category:datascan ?domain:sam.probe.onyphe.net ?subdomains:sam.probe.onyphe.net ?hostname:sam.probe.onyphe.net

NOTE: to perform this split, we rely on a list of TLDs built from the IANA list. Our list is also available on our GitHub.
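The OR query above can be generated for any hostname with a few lines of Python. This is a minimal sketch: or_query is a hypothetical helper, and the field list is taken from the example query in this section.

```python
from typing import Sequence

# Fields used in the OR example above.
DOMAIN_FIELDS = ("domain", "subdomains", "hostname")


def or_query(category: str, value: str,
             fields: Sequence[str] = DOMAIN_FIELDS) -> str:
    """Search the same value in several fields, each prefixed with ? (OR)."""
    terms = " ".join(f"?{field}:{value}" for field in fields)
    return f"category:{category} {terms}"


print(or_query("datascan", "sam.probe.onyphe.net"))
```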

Search functions

To refine your searches, we have a number of functions available. They may help you identify assets exposed in the past, reverse the sort order of results or narrow your search to specific assets.

NOTE: functions are only available with Enterprise licenses.

Time range functions

These functions allow you to search through historical data.

-hourago

Query data collected a given number of hours ago. The typical use case is to automate a search that runs every hour and looks for specific gems in the previous hour of collected data.


category:datascan protocol:rdp -hourago:1


To query the current hour:


category:datascan protocol:rdp -hourago:0


NOTE: an hour starts at minute 00 and ends at minute 59.

NOTE2: you can increment the hour counter as far as your license allows. For Lynx Views, that may be up to 30 days of data, so -hourago:720.

-dayago

In the same way, you may want to execute searches at the day granularity level. To query the previous day of data:


category:datascan protocol:rdp -dayago:1


To query the current day:


category:datascan protocol:rdp -dayago:0


NOTE: a day starts at 00:00 and ends at 23:59.

NOTE2: you can increment the day counter as far as your license allows. For Lynx Views, that may be up to 30 days of data, so -dayago:30.

-weekago

Same as before, at the week granularity level. To query the previous week of data:


category:datascan protocol:rdp -weekago:1


To query the current week:


category:datascan protocol:rdp -weekago:0


NOTE: a week starts on Monday at 00:00 and ends on Sunday at 23:59.

NOTE2: you can increment the week counter as far as your license allows. For Lynx Views, that may be up to 30 days of data, so -weekago:4.

-monthago

Same as before, at the month granularity level. To query the previous month of data:


category:datascan protocol:rdp -monthago:1


To query the current month:


category:datascan protocol:rdp -monthago:0


NOTE: a month starts on the 1st at 00:00 and ends on the last day of the month at 23:59.

NOTE2: you can increment the month counter as far as your license allows. For Lion Views, that may be up to 90 days of data, so -monthago:3.
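The window boundaries described for -hourago, -dayago, -weekago and -monthago can be reproduced locally, for example to label the results of an automated search. The Python sketch below is an illustration only: window is a hypothetical function, the windows are returned as half-open intervals (start included, end excluded), and UTC is an assumption, since the time zone used by the platform is not stated here.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def window(unit: str, ago: int, now: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of the calendar unit located `ago` units before now."""
    if unit == "hour":
        # An hour starts at minute 00.
        start = now.replace(minute=0, second=0, microsecond=0) - timedelta(hours=ago)
        return start, start + timedelta(hours=1)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        start = midnight - timedelta(days=ago)
        return start, start + timedelta(days=1)
    if unit == "week":
        # weekday() is 0 on Monday, so this rewinds to Monday 00:00.
        start = midnight - timedelta(days=midnight.weekday()) - timedelta(weeks=ago)
        return start, start + timedelta(weeks=1)
    if unit == "month":
        # Count months from year 0 to handle year rollover.
        index = now.year * 12 + (now.month - 1) - ago
        start = midnight.replace(year=index // 12, month=index % 12 + 1, day=1)
        nxt = index + 1
        end = midnight.replace(year=nxt // 12, month=nxt % 12 + 1, day=1)
        return start, end
    raise ValueError(f"unknown unit: {unit}")


now = datetime(2024, 3, 14, 15, 42, tzinfo=timezone.utc)  # assumed UTC
for unit, ago in [("hour", 1), ("day", 0), ("week", 1), ("month", 3)]:
    start, end = window(unit, ago, now)
    print(f"-{unit}ago:{ago} covers {start.isoformat()} up to {end.isoformat()}")
```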

-since

Sometimes, you may want to query the full time range allowed by your license. Please note that this function is subject to some limitations based on your license.

For instance, Eagle Views can use -since:7M from the Search API but not from the Export API. Griffin Views can use the full time range on all APIs, up to 48 months of historical data for the relevant categories. To search for all exposed rdp services over the full 7-month time range:


category:datascan protocol:rdp -since:7M

Wildcard functions

OQL can also search using wildcards. This is possible only against exact search fields, not against full-text search fields. These functions have the same limitations as the -since function: you can only use them against the last 30 days of data for Eagle Views, but on the full time range for Griffin Views.

Wildcards accept the same syntax as usual UNIX shells:


-wildcard

The syntax for wildcard functions is as follows:

category:<CATEGORY> -wildcard:<FIELD_NAME>,<SEARCH STRING>

category:<CATEGORY> -wildcard:<FIELD_NAME>,"<SEARCH STRING>" # with quotes if the string contains spaces

One of the use cases for wildcard searches is to identify typosquatting or phishing hostnames or domains. You may want to identify domains that look like yours, or to search against all TLDs for a given domain:


category:resolver -wildcard:domain,g??gle.com !domain:google.com


category:datascan -wildcard:hostname,*.google.com.* -notwildcard:domain,google.*

WARNING: this request is I/O intensive. You may receive request timeout errors. Feel free to relaunch your search until it succeeds.


category:resolver -wildcard:domain,google.*
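Because wildcard requests are I/O intensive, it can help to try a pattern locally before sending it. Python's standard fnmatch module implements UNIX shell-style patterns, so the sketch below uses it as a stand-in; the server-side matching may differ in details (case handling, for instance), and the candidate list is made up for the example.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Hypothetical sample values, for illustration only.
CANDIDATES = ["google.com", "g00gle.com", "goggle.com", "gogle.com", "google.net"]


def matches(pattern: str, values: Iterable[str]) -> List[str]:
    """Return the values matched by a shell-style wildcard pattern."""
    return [value for value in values if fnmatchcase(value, pattern)]


# Emulates: -wildcard:domain,g??gle.com !domain:google.com
lookalikes = [d for d in matches("g??gle.com", CANDIDATES) if d != "google.com"]
print(lookalikes)

# Emulates: -wildcard:domain,google.*
print(matches("google.*", CANDIDATES))
```

Here ? stands for exactly one character and * for any run of characters, so gogle.com (one character short) is not matched by g??gle.com.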

-orwildcard

You may also want to pass multiple wildcard conditions. Simply replace your -wildcard functions with multiple -orwildcard functions:


category:resolver -orwildcard:domain,g?ogle.com -orwildcard:domain,googl?.com !domain:google.com

-notwildcard

You can even exclude some wildcards:


category:resolver -orwildcard:domain,g?ogle.* -orwildcard:domain,googl?.* -notwildcard:domain,google.*

Regular Expressions

-regexp

Similar in syntax to wildcard functions, -regexp allows for powerful queries within exact match fields. Regular expressions can’t be used against full-text search enabled fields. The Ctiscan data model includes both exact match (.raw suffix) and full-text versions (.text suffix) of certain key fields, such as the HTML title. This allows for either full-text or regular expression searches against those fields.

The syntax for regexp functions is as follows:

category:<CATEGORY> -regexp:<FIELD_NAME>,"<REGULAR_EXPRESSION>"


category:ctl -regexp:domain,"g[^\\.o]ogle[a-z0-9-]*\\.[a-z\.]{1,}" -since:1w

Escaped special characters within the expression must themselves be escaped within OQL. A regular expression matching a literal full stop (period) therefore requires two backslashes (\\) to be correctly interpreted.

WARNING: regexp requests can be I/O intensive. You may receive request timeout errors. Feel free to relaunch your search until it succeeds.
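The double-escaping rule is easy to get wrong by hand. The Python sketch below takes a regular expression written with single backslashes and doubles them before building the function argument; oql_regexp is a hypothetical helper, and it assumes the expression contains no double-quote character.

```python
def oql_regexp(field: str, pattern: str, function: str = "regexp") -> str:
    """Build a -regexp style argument, doubling every backslash for OQL."""
    escaped = pattern.replace("\\", "\\\\")
    return f'-{function}:{field},"{escaped}"'


# Raw strings keep the backslashes exactly as typed.
print("category:ctl " + oql_regexp("domain", r"a[^\.p]ple\.[a-z]{1,}") + " -since:1w")
print(oql_regexp("tld", r"xyz|ru|co\.uk", function="notregexp"))
```

The function parameter lets the same helper produce -orregexp and -notregexp arguments.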

-orregexp

You may also want to pass multiple regular expression conditions or combine a regexp function with OR conditions. Simply replace your -regexp functions with multiple -orregexp functions:


category:ctl -orregexp:domain,"g[^\\.o]ogle[a-z0-9-]*\\.[a-z\.]{1,}" -orregexp:domain,"a[^\\.p]ple\\.[a-z\.]{1,}" -since:1w

-notregexp

Results can be excluded by regular expression using -notregexp.


category:ctl -regexp:domain,"g[^\\.o]ogle[a-z0-9-]*\\.[a-z\.]{1,}" -notregexp:tld,"xyz|ru|co\\.uk|edu.*"

Other functions

-exists

The use case for this function is to identify assets which have a specific field set. For instance, you may want to identify assets with a CVE identified, whatever the CVE is. The datascan and vulnscan categories are the most interesting ones to use this function against.


category:datascan domain:google.com -exists:cve


category:vulnscan domain:google.com -exists:cve

-notexists

Does the opposite of the -exists function. For instance, you may want to check that an asset has been scanned for vulnerabilities and is not vulnerable.


category:vulnscan domain:google.com -notexists:cve

-orexists

You may also want to search for different existing fields with the -orexists function. A use case would be to search for an existing CVE or an existing product:


category:vulnscan domain:google.com -orexists:cve -orexists:cpe

-fields

This function has been designed to reduce the volume of data before applying some local processing, or before integrating with a SIEM whose license price is based on the volume of indexed data. Sometimes, you may only be interested in identifying IP addresses from a specific search, thus you want to receive only the ip field as a result.


category:datascan ?port:3389 ?port:3390 ?port:3391 -fields:ip,port

-sort

By default, the latest result is displayed first. In some cases, you may want to identify the oldest result instead.


category:datascan app.http.title:"How to Restore Your Files" -since:7M -sort:0

-tlsexpired

Forgetting to renew a certificate is a thing. Also, enterprises not decommissioning assets is a thing. By searching for expired certificates, you can find lost treasure.


category:datascan domain:google.com -tlsexpired:1

Dorkpedia

You may be wondering how to search for specific products or devices. The dorkpedia is for you. You also have a list of dorks to help you identify the most important risks exposed by your assets.

OQLv2 (version 2)

Although backward compatible with OQLv1 queries, OQLv2 is a full rewrite of the ONYPHE application engine which allows for new and more powerful features. In this initial version the following capabilities have been added:


OQLv2 features are available for ASM-level and Ctiscan licenses. See the Pricing page or contact us for more information.

Condition groups

As with OQLv1, Boolean conditions in OQLv2 are specified as follows:


Condition groups allow for precedence when the query is parsed and executed. Parentheses are used to start and end a group, with a space required after the opening parenthesis (the examples below also keep one before the closing parenthesis).

The syntax is as follows:

category:<CATEGORY> ( ?filter1:<VALUE1> ?filter2:<VALUE2> -<OR_FUNCTION1>:<FUNCTION_VALUE1> ) filter3:<VALUE3> -<FUNCTION2>:<FUNCTION_VALUE2>

In this example, the expression within the parentheses is executed and then joined as a Boolean AND with the other filters and functions in the query. Multiple Condition groups can be specified, as follows:

category:<CATEGORY> ( ?filter1:<VALUE1> ?filter2:<VALUE2> -<OR_FUNCTION1>:<FUNCTION_VALUE1> ) ( ?filter3:<VALUE3> ?filter4:<VALUE4> -<NOT_FUNCTION2>:<FUNCTION_VALUE2> )


category:ctiscan ( ?app.device:medical ?app.device:scada app.protocol:http ) ( ?cert.tld:nl ?dns.tld:nl -orwildcard:dns.tld,*uk -orwildcard:cert.tld,*uk )


category:riskscan ( ?device.class:medical ?device.class:scada protocol:http ) ( ?tld:nl -orwildcard:tld,*uk )
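When queries are generated by a script, the spacing inside the parentheses is easy to forget. The Python sketch below wraps terms into a condition group with a space on each side; group is a hypothetical helper, and it rebuilds the riskscan example above.

```python
def group(*terms: str) -> str:
    """Wrap terms in parentheses, keeping a space inside each parenthesis."""
    if not terms:
        raise ValueError("a condition group needs at least one term")
    return "( " + " ".join(terms) + " )"


query = " ".join([
    "category:riskscan",
    group("?device.class:medical", "?device.class:scada", "protocol:http"),
    group("?tld:nl", "-orwildcard:tld,*uk"),
])
print(query)
```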

New error conditions