Ctiscan datamodel

Design

Our historic data model needed refactoring. We could have changed and improved it, but that would have impacted all current users and customers. So, we decided to craft an all new data model, along with a brand new information category dedicated to threat hunters. That’s why we have focused on including lots of pivots and technical indicators in this new data model.

Some key features of this new design:


The main idea is to store every data field in a layer. Each layer is dedicated to providing information from a set of related concepts. For instance, you will find all IP-related data at the ip layer. The same is true for TCP.

In a break from our historical approach, we decided to split the application layer into different parts. The app layer is where you’ll find information on application responses, without protocol-related information. For instance, the app.data.text field stores the raw application response. The app.protocol field is set according to the specifics of the raw application response, and a related new layer is created. For example, if app.protocol is set to http, an http layer will be created.

It’s also possible to have chained layers. By analyzing the http layer, you may find a favicon. The favicon data and attributes are stored in the favicon layer, and not within the http layer. The same is true for tls or other certificate data, which again is stored in a separate layer. In the end, depending on the detected protocol, you may have a cert layer, a favicon layer, a ja3 layer or a jarm layer added.

There are also some platform-specific layers, like the scanner layer which contains information about which of our scanners collected the data. The dns layer has been split into different parts, one part for reverse DNS information and another part for forward DNS information.

One of the most powerful layers may be component. This is where you can find information about all detected hardware and software components, along with detected products, vendors and CPEs.

At the root of this data model you can find a few general fields, like source or tag. However, we preferred to put the vast majority of fields in a dedicated layer.
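To make the layer idea concrete, here is a minimal Python sketch that groups dotted field names by their layer, meaning the first segment of the name. The helper name group_by_layer and the "root" label for fields without a dot are assumptions made for illustration. The field names themselves come from this page.

```python
from collections import defaultdict


def group_by_layer(fields):
    """Group dotted field names by their first segment (the layer).

    Field names without a dot are filed under a hypothetical "root" label.
    """
    layers = defaultdict(list)
    for name in fields:
        layer, sep, _rest = name.partition(".")
        layers[layer if sep else "root"].append(name)
    return dict(layers)


# Field names taken from this page; the grouping itself is only an illustration.
sample_fields = [
    "@category", "tag", "ip.dest", "tcp.dest",
    "app.protocol", "app.data.text",
    "http.header.etag", "favicon.data.md5",
    "dns.reverse.domain", "dns.forward.domain",
]

grouped = group_by_layer(sample_fields)
for layer, names in grouped.items():
    print(layer, names)
```

Sub-parts such as dns.reverse and dns.forward stay together under the dns layer here, since only the first segment is used as the grouping key.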

One last comment: every field has a defined type, which determines the search techniques available for it. The keyword type, the most used, allows for searching with exact strings, wildcards or even regexps.
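The three keyword search techniques can be pictured locally with a short Python sketch. This only illustrates the general semantics of exact, wildcard and regexp matching. It is not how the platform implements them, and the function name keyword_match, the shell-style wildcard syntax and the whole-value anchoring of regexps are all assumptions.

```python
import fnmatch
import re


def keyword_match(value, pattern, mode="exact"):
    """Illustrate exact, wildcard and regexp matching on one keyword value."""
    if mode == "exact":
        return value == pattern
    if mode == "wildcard":
        # Shell-style wildcards (* and ?), case-sensitive. Assumed syntax.
        return fnmatch.fnmatchcase(value, pattern)
    if mode == "regexp":
        # Assumption: the pattern must match the whole value.
        return re.fullmatch(pattern, value) is not None
    raise ValueError(f"unknown mode: {mode}")


server = "Apache/2.4.29 (Ubuntu)"
print(keyword_match(server, "Apache/2.4.29 (Ubuntu)"))
print(keyword_match(server, "Apache/2.4.*", mode="wildcard"))
print(keyword_match(server, r"Apache/2\.4\.\d+ \(Ubuntu\)", mode="regexp"))
```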

That’s it for an overview of how we decided to split the data and make it easier to understand and search. Now, if you want to know more, you can read every specific detail on this page, if you have the time :)

Sample queries

Search information about an IP address

category:ctiscan ip.dest:8.8.8.8

Search a specific port exposure

category:ctiscan tcp.dest:3389

Search information on a specific domain, with DNS resolution

category:ctiscan dns.domain:onyphe.io

Search information on a specific domain, with or without DNS resolution, from certificate

category:ctiscan cert.domain:traefik.default

Search information on a specific domain, from reverse DNS resolution

category:ctiscan dns.reverse.domain:onyphe.io

Search a specific product exposure

category:ctiscan component.text:ivanti

Search for a simple string in application response

category:ctiscan app.data.text:MeshCentral

Search for a simple string within HTML title

category:ctiscan html.title.text:MeshCentral

Search for an IP address with a list of open ports

category:ctiscan services.port:80 services.port:3389 services.port:5984

Search for an IP address with a list of open ports and a specific JA4T fingerprint

category:ctiscan services.port:80 services.port:3389 services.port:5984 ja4t.fingerprint.md5:"87752e7bdd0cd90559f7504835342639"

Search a specific certificate subject distinguished name

category:ctiscan cert.subject.dn:"C=IN, ST=Delhi, L=New Delhi, O=FREECHARGE PAYMENT TECHNOLOGIES PRIVATE LIMITED, CN=*.freecharge.in"
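If you assemble queries like the ones above from a script, a small helper can take care of quoting. The following is a minimal sketch, not an official client: the name build_query is hypothetical, and the rule "quote a value when it contains whitespace" is inferred from the samples on this page rather than from a documented grammar. Values containing a double quote are rejected because the escaping rule is not described here.

```python
def build_query(category, filters):
    """Join field:value filters into one query string.

    filters is a list of (field, value) pairs. Values containing whitespace
    are wrapped in double quotes (a rule inferred from the samples).
    """
    parts = [f"category:{category}"]
    for field, value in filters:
        text = str(value)
        if '"' in text:
            # The escaping rule is unknown here, so refuse instead of guessing.
            raise ValueError(f"cannot quote value for {field}: {text!r}")
        if any(ch.isspace() for ch in text):
            text = f'"{text}"'
        parts.append(f"{field}:{text}")
    return " ".join(parts)


print(build_query("ctiscan", [("services.port", 80), ("services.port", 3389)]))
print(build_query("ctiscan", [("cert.subject.dn", "C=IN, ST=Delhi, L=New Delhi")]))
```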

Search a specific certificate subject distinguished name against a regexp

category:ctiscan app.protocol:rdp -regexp:cert.subject.dn,cn=[a-z0-9]{15}

Search a specific serial number in certificate

category:ctiscan cert.serial.num:146473198

Search a favicon hash

Search an HTTP header server against a given string

category:ctiscan http.header.server:"Apache/2.4.29 (Ubuntu)"

Search a specific HTTP header etag value

category:ctiscan http.header.etag:"644e8f8f-cd"

Search hash of HTTP headers, order of headers is kept

category:ctiscan http.header.data.md5:"2b72b1cfb2353e7024a445697ca93534"
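Why does it matter that header order is kept? A hash computed over ordered data changes as soon as the order changes, so two servers sending the same headers in a different order get different values. The Python sketch below shows that property only. The serialization used here (name, colon, space, value, joined with CRLF) and the name header_fingerprint are assumptions, so the output is not expected to reproduce the values stored in http.header.data.md5.

```python
import hashlib


def header_fingerprint(headers):
    """MD5 over headers in the order given (hypothetical serialization)."""
    joined = "\r\n".join(f"{name}: {value}" for name, value in headers)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Same two headers, two different orders.
first = [("Server", "Apache/2.4.29 (Ubuntu)"), ("Content-Type", "text/html")]
second = list(reversed(first))

print(header_fingerprint(first))
print(header_fingerprint(second))
print(header_fingerprint(first) == header_fingerprint(second))
```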

Search for a specific domain with a specific string within application response

category:ctiscan dns.domain:google.com app.data.text:google

Search for a specific protocol

category:ctiscan app.protocol:rdp

Search for a specific kind of device

category:ctiscan app.device:c2

Search information on any entity

category:ctiscan entity.text:bank

Search using nested booleans: we want either port 8080 or port 1723 open, but not when port 80 or port 443 is also open

NOTE: requires an OQLv2-enabled user account.

category:ctiscan ( ?services.port:8080 ?services.port:1723 ) ( !services.port:80 !services.port:443 )
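The intent of this nested boolean query can be restated as a plain Python predicate over a set of open ports. This is a sketch of the logic described in the sentence above, not of the query engine. Reading ? as "at least one of" and ! as "must not" is our interpretation of that sentence, and the function name matches is hypothetical.

```python
def matches(open_ports):
    """True when 8080 or 1723 is open and neither 80 nor 443 is open."""
    ports = set(open_ports)
    wanted = bool(ports & {8080, 1723})   # ( ?8080 ?1723 ): at least one
    excluded = bool(ports & {80, 443})    # ( !80 !443 ): none allowed
    return wanted and not excluded


for candidate in ([8080], [1723, 22], [8080, 443], [80], [22]):
    print(candidate, matches(candidate))
```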

Layers

root layer

@category

@timestamp

tag

source

scanner layer

scanner.name

scanner.country

scanner.lcountry

ip layer

ip.version

ip.ttl

ip.src

ip.dest

ip.netname

ip.asn

ip.organization

ip.subnet

ip.country

ip.latitude

ip.longitude

ip.location

ip.lasn

ip.lorganization

ip.lsubnet

ip.lcountry

ip.llatitude

ip.llongitude

ip.llocation

tcp layer

tcp.src

tcp.dest

tcp.rtt

tcp.cpe

tcp.vendor

tcp.product

tcp.fingerprint.raw

tcp.fingerprint.md5

tcp.options

tcp.window

app layer

app.device

app.protocol

app.transport

app.tls

app.data.text

app.data.length

app.data.md5

app.data.mmh3

extract layer

extract.url

extract.domain

extract.hostname

extract.file

extract.ip

dns layer

dns.hostname

dns.domain

dns.idomain

dns.tld

dns.host

dns.reverse.hostname

dns.reverse.domain

dns.reverse.idomain

dns.reverse.tld

dns.reverse.host

dns.forward.hostname

dns.forward.domain

dns.forward.idomain

dns.forward.tld

dns.forward.host

cert layer

cert.hostname

cert.domain

cert.idomain

cert.tld

cert.host

cert.serial.hex

cert.serial.num

cert.validity.notbefore

cert.validity.notafter

cert.fingerprint.md5

cert.fingerprint.sha1

cert.fingerprint.sha256

cert.issuer.dn

cert.issuer.cn

cert.issuer.an

cert.issuer.o

cert.issuer.ou

cert.issuer.c

cert.issuer.l

cert.issuer.st

cert.issuer.e

cert.subject.dn

cert.subject.cn

cert.subject.an

cert.subject.o

cert.subject.ou

cert.subject.c

cert.subject.l

cert.subject.st

cert.subject.e

entity layer

entity.raw

entity.text

entity.count

ja4t layer

ja4t.fingerprint.raw

ja4t.fingerprint.md5

component layer

component.text

component.cpe

component.vendor

component.product

component.version

component.patch

component.distribution

component.count

http layer

http.version

http.code

http.url

http.defang

http.undefang

http.header.data.md5

http.header.data.mmh3

http.body.data.md5

http.body.data.mmh3

NOTE: the fields below are keyword types only, so there are no .raw or .text variants. If you need full-text search here, use the app.data.text field.

http.header.etag

http.header.lastmodified

http.header.wwwauthenticate

http.header.realm

http.header.cookie

http.header.contentlength

http.vhost

redirect layer

redirect.type

redirect.src

redirect.dest

html layer

html.title.raw

html.title.text

html.keywords.raw

html.keywords.text

html.description.raw

html.description.text

html.copyright.raw

html.copyright.text

ROADMAP: html.ssdeep

ROADMAP: html.domhash

tracker layer

tracker.ga

tracker.gaw

tracker.gtm

tracker.gpub

tracker.fbq

tracker.snaptr

tracker.newrelic

ftp layer

ftp.anonymous

favicon layer

favicon.url

favicon.data.base64

favicon.data.length

favicon.data.md5

favicon.data.mmh3

hassh layer

hassh.fingerprint.raw

hassh.fingerprint.md5

ssh layer

ssh.fingerprint.md5

ssh.fingerprint.sha1

ssh.fingerprint.sha256

ROADMAP: ja3s layer

ROADMAP: ja3s.fingerprint.raw

ROADMAP: ja3s.fingerprint.md5

ROADMAP: ja4s layer

ROADMAP: ja4s.fingerprint.raw

ROADMAP: ja4s.fingerprint.md5

ROADMAP: jarm layer

ROADMAP: jarm.fingerprint.raw

ROADMAP: jarm.fingerprint.md5

ROADMAP: jarm.hello1.ja3s.raw

ROADMAP: jarm.hello1.ja3s.md5

ROADMAP: jarm.hello2.ja3s.raw

ROADMAP: jarm.hello2.ja3s.md5

ROADMAP: jarm.hello3.ja3s.raw

ROADMAP: jarm.hello3.ja3s.md5

ROADMAP: jarm.hello4.ja3s.raw

ROADMAP: jarm.hello4.ja3s.md5

ROADMAP: jarm.hello5.ja3s.raw

ROADMAP: jarm.hello5.ja3s.md5

ROADMAP: jarm.hello6.ja3s.raw

ROADMAP: jarm.hello6.ja3s.md5

ROADMAP: jarm.hello7.ja3s.raw

ROADMAP: jarm.hello7.ja3s.md5

ROADMAP: jarm.hello8.ja3s.raw

ROADMAP: jarm.hello8.ja3s.md5

ROADMAP: jarm.hello9.ja3s.raw

ROADMAP: jarm.hello9.ja3s.md5

ROADMAP: jarm.hello10.ja3s.raw

ROADMAP: jarm.hello10.ja3s.md5