BUG: 'utf-8' codec can't decode byte 0x80 in position 8: invalid start byte #1526

Open
@tdruez

Running a map_deploy_to_develop pipeline with the following inputs:

An exception is raised during the match_resources_to_purldb step.

2025-01-10 00:42:31.088 Step [match_resources_to_purldb] starting
2025-01-10 00:42:31.111 Matching 328 .map, .js, .mjs, .ts, .d.ts, .jsx, .tsx, .css, .scss, .less, .sass, .soy, .class resources in PurlDB, using SHA1
2025-01-10 00:42:31.149 Progress: 10% (33/328)
2025-01-10 00:42:31.157 Progress: 20% (66/328)
2025-01-10 00:42:31.160 Pipeline failed
'utf-8' codec can't decode byte 0x80 in position 8: invalid start byte

Traceback:
  File "/opt/scancodeio/aboutcode/pipeline/__init__.py", line 199, in execute
    step(self)
  File "/opt/scancodeio/scanpipe/pipelines/deploy_to_develop.py", line 231, in match_resources_to_purldb
    d2d.match_purldb_resources(
  File "/opt/scancodeio/scanpipe/pipes/d2d.py", line 639, in match_purldb_resources
    _match_purldb_resources(
  File "/opt/scancodeio/scanpipe/pipes/d2d.py", line 663, in _match_purldb_resources
    for js_sha1 in js.source_content_sha1_list(to_resource):
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scancodeio/scanpipe/pipes/js.py", line 86, in source_content_sha1_list
    contents = get_map_sources_content(map_file)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scancodeio/scanpipe/pipes/js.py", line 114, in get_map_sources_content
    if data := load_json_from_file(map_file.location):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/scancodeio/scanpipe/pipes/js.py", line 94, in load_json_from_file
    return json.load(f)
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/json/__init__.py", line 293, in load
    return loads(fp.read(),
                 ^^^^^^^^^
  File "<frozen codecs>", line 322, in decode

This type of issue should not break the whole pipeline execution. We should instead catch the exception, create a project error message, and keep going with the execution.
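
One possible shape for that handling is sketched below. It is not the project's actual fix: it assumes load_json_from_file can return None for an unreadable file (the caller in the traceback already guards with `if data := ...`), and the `on_error` callback is a hypothetical stand-in for whatever mechanism records a project error message.

```python
import json


def load_json_from_file(location, on_error=None):
    """Return the parsed JSON at `location`, or None if it cannot be loaded.

    `on_error` is a hypothetical callback that receives a description of
    the failure; it stands in for creating a project error message.
    """
    try:
        with open(location, encoding="utf-8") as f:
            return json.load(f)
    except (UnicodeDecodeError, json.JSONDecodeError) as exception:
        # Report the problem instead of letting it abort the whole step.
        if on_error is not None:
            on_error(f"Cannot load JSON from {location}: {exception}")
        return None
```

With this shape, a .map file that is not valid UTF-8 or not valid JSON is reported and skipped, and the matching loop moves on to the next resource.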
