[Bug] Crawling discrepancies and inconsistencies #8

@BreakfastSerial

Description

I'm currently evaluating a number of tools similar to HTTPLoot, and I've come across discrepancies and inconsistencies when using this one.

The test setup is very basic and consists of a locally hosted, static website.

index.html

<html>
<script src=cool.js></script>
<h1>hello, I have a cool file</h1>
<a href=coolfile.txt>coolfile</a>
<style>bye</style>
</html>

Both coolfile.txt and cool.js contain just:

var secret="secret123";
var jwt="secret123";
var apitoken="secret123";
var AWS_ACCESS_KEY_ID="ASIAY34FZKBOKMUTVV7A";
var password="secret123";
var db_password="secret123";
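For reference, every line in these files should be trivially detectable with simple pattern matching; a minimal sketch of such a check (the patterns below are my own illustration, not HTTPLoot's actual rule set):

```python
import re

# Sample file content from the test setup above
content = '''var secret="secret123";
var jwt="secret123";
var apitoken="secret123";
var AWS_ACCESS_KEY_ID="ASIAY34FZKBOKMUTVV7A";
var password="secret123";
var db_password="secret123";
'''

# Illustrative patterns only -- not HTTPLoot's real signatures
patterns = {
    "generic-secret": re.compile(r'(?i)\b(secret|password|apitoken|jwt)\w*\s*=\s*"[^"]+"'),
    "aws-access-key": re.compile(r'\b(ASIA|AKIA)[0-9A-Z]{16}\b'),
}

findings = []
for name, pattern in patterns.items():
    for match in pattern.finditer(content):
        findings.append((name, match.group(0)))

for name, hit in findings:
    print(name, "->", hit)
```

So once the files are actually fetched, there is nothing subtle about the content itself; the inconsistency is purely in which URLs get requested.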

This is hosted using Python (python3 -m http.server 80), so it's accessible from the network via the standard HTTP port.

The problem:

Now I've noticed that if I use the localhost IP (127.0.0.1), crawling is limited to index.html: neither coolfile.txt nor cool.js is requested.

When using the "external" IP (192.168.0.10), cool.js is requested, but coolfile.txt is only sometimes requested.

I deleted the results between runs. I found no way to reproduce this behavior consistently.

Crawling should behave the same regardless of the IP address, and I'd expect the same resources to be requested on every run of the tool.
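For what it's worth, both resources are statically discoverable from the index.html above, so a deterministic crawler should find them on every run. A quick sanity check with Python's stdlib html.parser (my own verification, not how HTTPLoot itself crawls):

```python
from html.parser import HTMLParser

# The index.html from the test setup above
html = '''<html>
<script src=cool.js></script>
<h1>hello, I have a cool file</h1>
<a href=coolfile.txt>coolfile</a>
<style>bye</style>
</html>'''

class LinkExtractor(HTMLParser):
    """Collect script src and anchor href values from a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and "src" in attrs:
            self.links.append(attrs["src"])
        elif tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])

parser = LinkExtractor()
parser.feed(html)
print(parser.links)  # both cool.js and coolfile.txt appear in the page source
```

Note that html.parser handles the unquoted attribute values in the test page without issue, so the markup itself shouldn't be the reason the links are missed.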

I run HTTPLoot (latest release, v0.1 from Jun 16, 2022) as follows:

localhost:~/httploot# ./httploot-linux64 http://192.168.0.10
# or
localhost:~/httploot# ./httploot-linux64 http://127.0.0.1
