During the implementation of our own Mask Autoencoder, we encountered two questions.
Hi @cbe135, the mask autoencoder takes 8 input channels (see tutorials/generation/maisi/configs/config_maisi3d-rflow.json, line 94 in 8b90a16). This is mainly because we use a binary representation to encode the input mask, which saves memory. 8 channels can represent 2**8 = 256 labels (0~255), with each channel holding one bit (see the function in tutorials/generation/maisi/scripts/utils.py, lines 175 to 190 in 8b90a16). For example, label 1 is encoded as [0, 0, 0, 0, 0, 0, 0, 1]. For your use case, are the labels of your dataset covered by the pre-defined label dict?
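To make the bit-per-channel idea concrete, here is a minimal sketch in plain Python. It is not the function from scripts/utils.py: the names `encode_label` and `decode_label` are hypothetical, it works on a single integer label rather than a mask tensor, and the most-significant-bit-first channel order is an assumption taken from the label 1 example above, so the real implementation may order the bits differently.

```python
from __future__ import annotations


def encode_label(label: int, bits: int = 8) -> list[int]:
    """Encode one integer label as `bits` binary channels, most significant bit first.

    Hypothetical helper for illustration; the bit order is an assumption.
    """
    if not 0 <= label < 2 ** bits:
        raise ValueError(f"label {label} does not fit in {bits} bits")
    # Shift the wanted bit down to position 0 and mask it out.
    return [(label >> shift) & 1 for shift in range(bits - 1, -1, -1)]


def decode_label(channels: list[int]) -> int:
    """Invert encode_label: fold the bit channels back into one integer label."""
    value = 0
    for bit in channels:
        value = (value << 1) | bit
    return value


print(encode_label(1))                  # [0, 0, 0, 0, 0, 0, 0, 1]
print(decode_label(encode_label(200)))  # 200
```

With 8 channels, any label from 0 to 255 round-trips through this pair, while a label outside that range raises a `ValueError` instead of being silently truncated.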
Hi @guopengf, thank you.
Hello @guopengf, my dataset has three categories that are not among the 132 predefined categories in the label_dict, so I mapped them to dummy6, dummy7, and dummy8. I converted the masks into the input format expected by the Mask Autoencoder, encoded them, and then decoded them to reconstruct the input, but the reconstructed result differs greatly from the input. Is this normal? Here are illustrative examples of my input and output.
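One way to put a number on "differs greatly" is a per-label Dice score between the input mask and the reconstruction. The snippet below is only a sketch under assumptions: `dice_per_label` is a hypothetical helper, not part of MAISI, and it expects both masks as flattened sequences of integer labels of equal length.

```python
from collections import Counter


def dice_per_label(original, reconstructed):
    """Return {label: Dice score} for every label present in either mask.

    Hypothetical helper for illustration; both inputs are flattened
    sequences of integer labels with the same length.
    """
    if len(original) != len(reconstructed):
        raise ValueError("masks must have the same number of voxels")
    orig_counts = Counter(original)
    recon_counts = Counter(reconstructed)
    # Count voxels where both masks agree, grouped by label.
    overlap = Counter(o for o, r in zip(original, reconstructed) if o == r)
    labels = sorted(set(orig_counts) | set(recon_counts))
    return {
        label: 2 * overlap[label] / (orig_counts[label] + recon_counts[label])
        for label in labels
    }


# Toy example: one voxel of label 1 is reconstructed as label 2.
scores = dice_per_label([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
print(scores)
```

A score of 1.0 means a label was reconstructed exactly, and low scores on only the remapped labels would point at those categories rather than at the autoencoder as a whole.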