Commit b4a85c0: Merge pull request #418 from CIRCL/domain_object (Domain object)

2 parents: 6ddd3b8 + 880c351

82 files changed: +2190 / -1738 lines

.gitignore

Lines changed: 4 additions & 2 deletions

```diff
@@ -35,9 +35,11 @@ var/www/server.crt
 var/www/server.key
 
 # Local config
-bin/packages/config.cfg
-bin/packages/config.cfg.backup
 configs/keys
+bin/packages/core.cfg
+bin/packages/config.cfg.backup
+configs/core.cfg
+configs/core.cfg.backup
 configs/update.cfg
 update/current_version
 files
```

HOWTO.md

Lines changed: 4 additions & 5 deletions

```diff
@@ -25,7 +25,7 @@ Feed data to AIL:
 
 3. Launch pystemon ``` ./pystemon ```
 
-4. Edit your configuration file ```bin/packages/config.cfg``` and modify the pystemonpath path accordingly
+4. Edit your configuration file ```configs/core.cfg``` and modify the pystemonpath path accordingly
 
 5. Launch pystemon-feeder ``` ./bin/feeder/pystemon-feeder.py ```
 
@@ -123,7 +123,7 @@ There are two types of installation. You can install a *local* or a *remote* Spl
 (for a linux docker, the localhost IP is *172.17.0.1*; Should be adapted for other platform)
 - Restart the tor proxy: ``sudo service tor restart``
 
-3. *(AIL host)* Edit the ``/bin/packages/config.cfg`` file:
+3. *(AIL host)* Edit the ``/configs/core.cfg`` file:
 - In the crawler section, set ``activate_crawler`` to ``True``
 - Change the IP address of Splash servers if needed (remote only)
 - Set ``splash_onion_port`` according to your Splash servers port numbers that will be used.
@@ -134,7 +134,7 @@ There are two types of installation. You can install a *local* or a *remote* Spl
 
 - *(Splash host)* Launch all Splash servers with:
 ```sudo ./bin/torcrawler/launch_splash_crawler.sh -f <config absolute_path> -p <port_start> -n <number_of_splash>```
-With ``<port_start>`` and ``<number_of_splash>`` matching those specified at ``splash_onion_port`` in the configuration file of point 3 (``/bin/packages/config.cfg``)
+With ``<port_start>`` and ``<number_of_splash>`` matching those specified at ``splash_onion_port`` in the configuration file of point 3 (``/configs/core.cfg``)
 
 All Splash dockers are launched inside the ``Docker_Splash`` screen. You can use ``sudo screen -r Docker_Splash`` to connect to the screen session and check all Splash servers status.
 
@@ -148,7 +148,7 @@ All Splash dockers are launched inside the ``Docker_Splash`` screen. You can use
 - ```crawler_hidden_services_install.sh -y```
 - Add the following line in ``SOCKSPolicy accept 172.17.0.0/16`` in ``/etc/tor/torrc``
 - ```sudo service tor restart```
-- set activate_crawler to True in ``/bin/packages/config.cfg``
+- set activate_crawler to True in ``/configs/core.cfg``
 #### Start
 - ```sudo ./bin/torcrawler/launch_splash_crawler.sh -f $AIL_HOME/configs/docker/splash_onion/etc/splash/proxy-profiles/ -p 8050 -n 1```
 
@@ -166,4 +166,3 @@ Then starting the crawler service (if you follow the procedure above)
 ##### Python 3 Upgrade
 
 To upgrade from an existing AIL installation, you have to launch [python3_upgrade.sh](./python3_upgrade.sh), this script will delete and create a new virtual environment. The script **will upgrade the packages but won't keep your previous data** (neverthless the data is copied into a directory called `old`). If you install from scratch, you don't require to launch the [python3_upgrade.sh](./python3_upgrade.sh).
-
```
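The crawler setup in the HOWTO above ultimately amounts to flipping ``activate_crawler`` to ``True`` in ``configs/core.cfg``. A minimal non-interactive sketch of that step, under assumptions: the sample config fragment is made up, the key is stored as an ini-style ``key = value`` line, and GNU sed's ``-i`` is available:

```shell
#!/bin/bash
# Sketch: enable the crawler flag in a core.cfg-style ini file.
# The file content below is a made-up sample; in a real install the
# target would be $AIL_HOME/configs/core.cfg.
CONFIG=core.cfg

cat > "$CONFIG" <<'EOF'
[Crawler]
activate_crawler = False
splash_onion_port = 8050-8052
EOF

# Flip the flag in place (assumes GNU sed)
sed -i 's/^activate_crawler = .*/activate_crawler = True/' "$CONFIG"
grep '^activate_crawler' "$CONFIG"
```

After this, restarting the crawler via ``LAUNCH.sh`` would pick up the new value.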

OVERVIEW.md

Lines changed: 35 additions & 2 deletions

```diff
@@ -261,6 +261,9 @@ Redis and ARDB overview
 | set_pgpdump_name:*name* | *item_path* |
 | | |
 | set_pgpdump_mail:*mail* | *item_path* |
+| | |
+| | |
+| set_domain_pgpdump_**pgp_type**:**key** | **domain** |
 
 ##### Hset date:
 | Key | Field | Value |
@@ -288,11 +291,20 @@ Redis and ARDB overview
 | item_pgpdump_name:*item_path* | *name* |
 | | |
 | item_pgpdump_mail:*item_path* | *mail* |
+| | |
+| | |
+| domain_pgpdump_**pgp_type**:**domain** | **key** |
 
 #### Cryptocurrency
 
 Supported cryptocurrency:
 - bitcoin
+- bitcoin-cash
+- dash
+- etherum
+- litecoin
+- monero
+- zcash
 
 ##### Hset:
 | Key | Field | Value |
@@ -303,7 +315,8 @@ Supported cryptocurrency:
 ##### set:
 | Key | Value |
 | ------ | ------ |
-| set_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **item_path** |
+| set_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **item_path** | PASTE
+| domain_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **domain** | DOMAIN
 
 ##### Hset date:
 | Key | Field | Value |
@@ -318,8 +331,14 @@ Supported cryptocurrency:
 ##### set:
 | Key | Value |
 | ------ | ------ |
-| item_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** |
+| item_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** | PASTE
+| domain_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** | DOMAIN
 
+#### HASH
+| Key | Value |
+| ------ | ------ |
+| hash_domain:**domain** | **hash** |
+| domain_hash:**hash** | **domain** |
 
 ## DB9 - Crawler:
 
@@ -362,6 +381,20 @@ Supported cryptocurrency:
 }
 ```
 
+##### CRAWLER QUEUES:
+| SET - Key | Value |
+| ------ | ------ |
+| onion_crawler_queue | **url**;**item_id** | RE-CRAWL
+| regular_crawler_queue | - |
+| | |
+| onion_crawler_priority_queue | **url**;**item_id** | USER
+| regular_crawler_priority_queue | - |
+| | |
+| onion_crawler_discovery_queue | **url**;**item_id** | DISCOVER
+| regular_crawler_discovery_queue | - |
+
+##### TO CHANGE:
+
 ARDB overview
 
 ----------------------------------------- SENTIMENT ------------------------------------
```
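The new HASH table in OVERVIEW.md documents a pair of mirrored sets linking crawled domains and decoded hashes. A minimal sketch of how those key names are composed; the helper function is hypothetical, only the ``hash_domain:``/``domain_hash:`` patterns come from the table:

```python
# Hypothetical helper illustrating the mirrored key scheme from the
# OVERVIEW.md HASH table. Only the key patterns are from the document.

def build_domain_hash_keys(domain, decoded_hash):
    """Return the two set keys linking a crawled domain and a decoded hash."""
    return (
        'hash_domain:{}'.format(domain),       # members: hashes seen on this domain
        'domain_hash:{}'.format(decoded_hash)  # members: domains where this hash was seen
    )

if __name__ == '__main__':
    keys = build_domain_hash_keys('example.onion', 'deadbeef')
    print(keys[0])  # hash_domain:example.onion
    print(keys[1])  # domain_hash:deadbeef
```

Keeping both directions as separate sets lets either side be enumerated with a single `SMEMBERS` call, at the cost of writing two keys per association.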

bin/Decoder.py

Lines changed: 7 additions & 0 deletions

```diff
@@ -18,6 +18,7 @@
 
 from Helper import Process
 from packages import Paste
+from packages import Item
 
 import re
 import signal
@@ -120,6 +121,12 @@ def save_hash(decoder_name, message, date, decoded):
     serv_metadata.zincrby('nb_seen_hash:'+hash, message, 1)# hash - paste map
     serv_metadata.zincrby(decoder_name+'_hash:'+hash, message, 1) # number of b64 on this paste
 
+    # Domain Object
+    if Item.is_crawled(message):
+        domain = Item.get_item_domain(message)
+        serv_metadata.sadd('hash_domain:{}'.format(domain), hash) # domain - hash map
+        serv_metadata.sadd('domain_hash:{}'.format(hash), domain) # hash - domain map
+
 
 def save_hash_on_disk(decode, type, hash, json_data):
 
```
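The block added to ``save_hash`` only fires for crawled items and maintains the two mirrored sets. The same logic can be sketched against an in-memory stand-in; ``FakeMetadata`` and the sample ``is_crawled``/``get_item_domain`` behaviour are assumptions for the demo, not AIL code:

```python
# Illustration of the domain<->hash mapping added to save_hash, using an
# in-memory stand-in for the ARDB metadata server. FakeMetadata and the
# item-path layout below are assumptions made for this sketch.

class FakeMetadata:
    def __init__(self):
        self.sets = {}

    def sadd(self, key, value):
        self.sets.setdefault(key, set()).add(value)

def is_crawled(item_path):
    # Stand-in for Item.is_crawled(): assume crawled items live under 'crawled/'
    return item_path.startswith('crawled/')

def get_item_domain(item_path):
    # Stand-in for Item.get_item_domain(): assume 'crawled/<domain>/...'
    return item_path.split('/')[1]

serv_metadata = FakeMetadata()
message = 'crawled/example.onion/2019/05/01/a.gz'
decoded_hash = 'deadbeef'

if is_crawled(message):
    domain = get_item_domain(message)
    serv_metadata.sadd('hash_domain:{}'.format(domain), decoded_hash)  # domain - hash map
    serv_metadata.sadd('domain_hash:{}'.format(decoded_hash), domain)  # hash - domain map
```

The real module performs the same two `sadd` calls against ARDB, so every decoded hash found on a crawled page becomes queryable from both the domain and the hash side.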

bin/Helper.py

Lines changed: 3 additions & 4 deletions

```diff
@@ -20,10 +20,10 @@
 import json
 
 
-class PubSub(object):
+class PubSub(object): ## TODO: remove config, use ConfigLoader by default
 
     def __init__(self):
-        configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
+        configfile = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg')
         if not os.path.exists(configfile):
             raise Exception('Unable to find the configuration file. \
                             Did you set environment variables? \
@@ -58,7 +58,6 @@ def setup_subscribe(self, conn_name):
         for address in addresses.split(','):
             new_sub = context.socket(zmq.SUB)
             new_sub.connect(address)
-            # bytes64 encode bytes to ascii only bytes
             new_sub.setsockopt_string(zmq.SUBSCRIBE, channel)
             self.subscribers.append(new_sub)
 
@@ -112,7 +111,7 @@ def subscribe(self):
 class Process(object):
 
     def __init__(self, conf_section, module=True):
-        configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
+        configfile = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg')
         if not os.path.exists(configfile):
             raise Exception('Unable to find the configuration file. \
                             Did you set environment variables? \
```
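Several files in this commit swap hand-rolled ``configparser`` boilerplate for a shared ``ConfigLoader`` imported from ``lib/``. A minimal sketch of what such a loader looks like, built on stdlib ``configparser``; this is an illustration only, and the real AIL class also exposes ``get_redis_conn()``, omitted here because it needs a Redis client:

```python
import configparser
import os

class ConfigLoader:
    """Minimal sketch of a shared loader for configs/core.cfg.
    Illustration only; the real AIL ConfigLoader also provides
    get_redis_conn() and related helpers."""

    def __init__(self, config_file=None):
        if config_file is None:
            # mirrors the path used throughout this commit
            config_file = os.path.join(os.environ['AIL_HOME'], 'configs/core.cfg')
        if not os.path.exists(config_file):
            raise Exception('Unable to find the configuration file. '
                            'Did you set environment variables? '
                            'Or activate the virtualenv.')
        self.cfg = configparser.ConfigParser()
        self.cfg.read(config_file)

    def get_config_str(self, section, key):
        return self.cfg.get(section, key)

    def get_config_int(self, section, key):
        return self.cfg.getint(section, key)
```

Centralizing this removes the duplicated existence check and `ConfigParser` setup that each module (Helper, Mixer, MISP_The_Hive_feeder) previously carried.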

bin/LAUNCH.sh

Lines changed: 1 addition & 1 deletion

```diff
@@ -222,7 +222,7 @@ function launching_scripts {
 
 function launching_crawler {
     if [[ ! $iscrawler ]]; then
-        CONFIG=$AIL_BIN/packages/config.cfg
+        CONFIG=$AIL_HOME/configs/core.cfg
        lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_port/{print $3;exit}' "${CONFIG}")
 
        IFS='-' read -ra PORTS <<< "$lport"
```
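``launching_crawler`` reads the Splash port range straight out of the ini file with awk, then splits it on ``-``. A standalone sketch of that extraction; the sample config content is made up, and bash is assumed for the ``<<<`` here-string:

```shell
#!/bin/bash
# Demonstrates the awk extraction used by launching_crawler: set a flag
# when the [Crawler] section starts, then print the value ($3) of the
# first splash_port line and stop. Sample config content is made up.
CONFIG=core_sample.cfg
cat > "$CONFIG" <<'EOF'
[Crawler]
activate_crawler = True
splash_port = 8050-8052
EOF

lport=$(awk '/^\[Crawler\]/{f=1} f==1&&/^splash_port/{print $3;exit}' "$CONFIG")
echo "$lport"        # 8050-8052

# The same range split LAUNCH.sh performs next
IFS='-' read -ra PORTS <<< "$lport"
echo "${PORTS[0]}"   # 8050
echo "${PORTS[1]}"   # 8052
```

Because awk matches ``splash_port`` only after ``[Crawler]`` is seen, a ``splash_port`` key in another section would be ignored.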

bin/MISP_The_Hive_feeder.py

Lines changed: 11 additions & 25 deletions

```diff
@@ -8,20 +8,20 @@
 This module send tagged pastes to MISP or THE HIVE Project
 
 """
-
-import redis
-import sys
 import os
+import sys
+import uuid
+import redis
 import time
 import json
-import configparser
 
 from pubsublogger import publisher
 from Helper import Process
 from packages import Paste
 import ailleakObject
 
-import uuid
+sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
+import ConfigLoader
 
 from pymisp import PyMISP
 
@@ -133,26 +133,10 @@ def feeder(message, count=0):
 
     config_section = 'MISP_The_hive_feeder'
 
-    configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
-    if not os.path.exists(configfile):
-        raise Exception('Unable to find the configuration file. \
-                        Did you set environment variables? \
-                        Or activate the virtualenv.')
-
-    cfg = configparser.ConfigParser()
-    cfg.read(configfile)
+    config_loader = ConfigLoader.ConfigLoader()
 
-    r_serv_db = redis.StrictRedis(
-        host=cfg.get("ARDB_DB", "host"),
-        port=cfg.getint("ARDB_DB", "port"),
-        db=cfg.getint("ARDB_DB", "db"),
-        decode_responses=True)
-
-    r_serv_metadata = redis.StrictRedis(
-        host=cfg.get("ARDB_Metadata", "host"),
-        port=cfg.getint("ARDB_Metadata", "port"),
-        db=cfg.getint("ARDB_Metadata", "db"),
-        decode_responses=True)
+    r_serv_db = config_loader.get_redis_conn("ARDB_DB")
+    r_serv_metadata = config_loader.get_redis_conn("ARDB_Metadata")
 
     # set sensor uuid
     uuid_ail = r_serv_db.get('ail:uuid')
@@ -212,7 +196,9 @@ def feeder(message, count=0):
 
     refresh_time = 3
     ## FIXME: remove it
-    PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], cfg.get("Directories", "pastes"))
+    PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes"))
+    config_loader = None
+
     time_1 = time.time()
 
     while True:
```
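The deleted boilerplate in this file shows exactly which parameters a ``get_redis_conn("ARDB_DB")`` call has to gather. A dependency-free sketch of that parameter gathering, using only stdlib ``configparser``; the sample config values are made up, and ``redis.StrictRedis`` itself is deliberately left out so the sketch stays runnable:

```python
# Sketch of the parameter gathering that a get_redis_conn(section) call
# plausibly wraps. The section/key names and decode_responses=True match
# the code removed in this commit; the sample values are invented.
import configparser

SAMPLE_CFG = """
[ARDB_DB]
host = localhost
port = 6382
db = 0
"""

def redis_conn_params(cfg, section):
    """Collect the keyword arguments the old code passed to redis.StrictRedis."""
    return dict(
        host=cfg.get(section, 'host'),
        port=cfg.getint(section, 'port'),
        db=cfg.getint(section, 'db'),
        decode_responses=True,
    )

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE_CFG)
params = redis_conn_params(cfg, 'ARDB_DB')
```

In the real module these params would be splatted into the client constructor (`redis.StrictRedis(**params)`), which is what the new one-line `get_redis_conn` call replaces.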

bin/Mixer.py

Lines changed: 14 additions & 25 deletions

```diff
@@ -29,16 +29,20 @@
 The mapping can be done via the variable FEED_QUEUE_MAPPING
 
 """
+import os
+import sys
+
 import base64
 import hashlib
-import os
 import time
 from pubsublogger import publisher
 import redis
-import configparser
 
 from Helper import Process
 
+sys.path.append(os.path.join(os.environ['AIL_BIN'], 'lib/'))
+import ConfigLoader
+
 
 # CONFIG #
 refresh_time = 30
@@ -52,37 +56,22 @@
 
 p = Process(config_section)
 
-configfile = os.path.join(os.environ['AIL_BIN'], 'packages/config.cfg')
-if not os.path.exists(configfile):
-    raise Exception('Unable to find the configuration file. \
-                    Did you set environment variables? \
-                    Or activate the virtualenv.')
-
-cfg = configparser.ConfigParser()
-cfg.read(configfile)
+config_loader = ConfigLoader.ConfigLoader()
 
 # REDIS #
-server = redis.StrictRedis(
-    host=cfg.get("Redis_Mixer_Cache", "host"),
-    port=cfg.getint("Redis_Mixer_Cache", "port"),
-    db=cfg.getint("Redis_Mixer_Cache", "db"),
-    decode_responses=True)
-
-server_cache = redis.StrictRedis(
-    host=cfg.get("Redis_Log_submit", "host"),
-    port=cfg.getint("Redis_Log_submit", "port"),
-    db=cfg.getint("Redis_Log_submit", "db"),
-    decode_responses=True)
+server = config_loader.get_redis_conn("Redis_Mixer_Cache")
+server_cache = config_loader.get_redis_conn("Redis_Log_submit")
 
 # LOGGING #
 publisher.info("Feed Script started to receive & publish.")
 
 # OTHER CONFIG #
-operation_mode = cfg.getint("Module_Mixer", "operation_mode")
-ttl_key = cfg.getint("Module_Mixer", "ttl_duplicate")
-default_unnamed_feed_name = cfg.get("Module_Mixer", "default_unnamed_feed_name")
+operation_mode = config_loader.get_config_int("Module_Mixer", "operation_mode")
+ttl_key = config_loader.get_config_int("Module_Mixer", "ttl_duplicate")
+default_unnamed_feed_name = config_loader.get_config_str("Module_Mixer", "default_unnamed_feed_name")
 
-PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], p.config.get("Directories", "pastes")) + '/'
+PASTES_FOLDER = os.path.join(os.environ['AIL_HOME'], config_loader.get_config_str("Directories", "pastes")) + '/'
+config_loader = None
 
 # STATS #
 processed_paste = 0
```
