Skip to content

Mesh loaders kevin #894

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 23 commits into
base: master
Choose a base branch
from
Open

Mesh loaders kevin #894

wants to merge 23 commits into from

Conversation

kevyuu
Copy link

@kevyuu kevyuu commented Jun 25, 2025

Implement extra geometries

Comment on lines 159 to 164
hlsl::vector<int8_t,3>(0, 0, 127),
hlsl::vector<int8_t,3>(0, 0, 1),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh ? you've made a bug

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its still 1 on the latest commit

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 284 to 285
.format = EF_R16_UINT,
.rangeFormat = IGeometryBase::EAABBFormat::U16
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you defined index_t = uint32_t ... then format should be R16 and U16 respectively

Edit: I meant R32 and U32 wote wrong thing by accident

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still wrong, you've got 32bit index but you're saying the index buffer view fromat is 16 bit which will cause the sphere to render wrong

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 823 to 844
auto cylinder = createCylinder(width0, cylinderHeight, tesselationCylinder, vtxColor0);
auto cone = createCone(width1, height-cylinderHeight, tesselationCone, vtxColor1, vtxColor1);

auto cylinderPositions = reinterpret_cast<position_t*>(cylinder->getPositionView().src.buffer->getPointer());
auto conePositions = reinterpret_cast<position_t*>(cone->getPositionView().src.buffer->getPointer());

const auto cylinderNormals = reinterpret_cast<normal_t*>(cylinder->getNormalView().src.buffer->getPointer());
const auto coneNormals = reinterpret_cast<normal_t*>(cone->getNormalView().src.buffer->getPointer());

const auto cylinderUvs = reinterpret_cast<uv_t*>(cylinder->getAuxAttributeViews()->front().src.buffer->getPointer());
const auto coneUvs = reinterpret_cast<uv_t*>(cone->getAuxAttributeViews()->front().src.buffer->getPointer());

const auto cylinderIndices = cylinder->getIndexView().src.buffer->getPointer();
const auto coneIndices = cone->getIndexView().src.buffer->getPointer();

const auto cylinderVertexCount = cylinder->getPositionView().getElementCount();
const auto coneVertexCount = cone->getPositionView().getElementCount();
const auto newArrowVertexCount = cylinderVertexCount + coneVertexCount;

const auto cylinderIndexCount = cylinder->getVertexReferenceCount();
const auto coneIndexCount = cone->getVertexReferenceCount();
const auto newArrowIndexCount = cylinderIndexCount + coneIndexCount;
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually. make this function return a geometry collection!

Its a perfect use case! (untouched geometries, just transforms applied)

You can skip rendering the arrow in ex09 (only test with BLAS stuff in 67)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ex 67 does not test the collection

Base automatically changed from mesh_loaders to master July 1, 2025 09:09
@@ -58,11 +58,10 @@ class NBL_API2 CGeometryCreator final : public core::IReferenceCounted
\param colorCone color of the cone
\return Generated mesh.
*/
core::smart_refctd_ptr<ICPUPolygonGeometry> createArrow(const uint32_t tesselationCylinder = 4,
core::vector<core::smart_refctd_ptr<ICPUPolygonGeometry>> createArrow(const uint32_t tesselationCylinder = 4,
Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, return a ICPUGeometryCollection

P.S. never use containers such as vector for cross-DLL calls (anything thats not inline)

uvs = reinterpret_cast<decltype(uvs)>(buff->getPointer());
shapes::AABB<4, uint16_t> aabb;
aabb.minVx = hlsl::vector<uint8_t, 4>(0,0,0,0);
aabb.maxVx = hlsl::vector<uint8_t, 4>(255,255,0,0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong max value, might want to use numeric_limits or something to work that out/keep in sync

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 358 to 359
.format = EF_R8G8_UNORM,
.rangeFormat = IGeometryBase::EAABBFormat::U8_NORM

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong formats, should be 16bit

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

tu = static_cast<float>(acos(core::clamp(normal.X / sinay, -1.0, 1.0)) * 0.5 * core::RECIPROCAL_PI<double>());
if (normal.Z < 0.0f)
if (normal.y != -1.0f && normal.y != 1.0f)
tu = static_cast<float>(acos(core::clamp(normal.x / sinay, -1.0, 1.0)) * 0.5 * core::RECIPROCAL_PI<double>());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get rid of core::RECIPROCAL_PI we have some numbers.hlsl header that gives that

do everything in hlsl:: tgmath, also double overkill here, use float in the templates

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

((uint32_t*)tmpMemPtr)[6] = ((uint32_t*)oldTmpMemPtr)[6];
tmpMemPtr += vertexSize;
positions[vertex_i] = positions[old_vertex_i];
uvs[vertex_i] = { 127, uvs[old_vertex_i].y };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, stop using constant literals, you're making the code exteremely bug prone (you're supposed to have uint16_t max value here, not int8_t max)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would actually be best to use encodePixel with 1.f for X or pack<uint16_t>(1.0) to make sure hardcoding doesn't hurt you

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 18 to 21
static uint8_t packSnorm(float val)
{
return round(hlsl::clamp(val, -1.0f, 1.0f) * 127);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erm you're using this to encode sphere UV coords which should be unorm16, just use encodePixel or make this templated so one can't make mistakes like this with implicit conversions

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. use encode pixels isntead

const auto f_i = static_cast<float>(i);
hlsl::float32_t3 p(std::cos(f_i * step), std::sin(f_i * step), 0.f);
p *= radius;
const auto n = quantNormalCache->quantize<NormalCacheFormat>(hlsl::normalize(p));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also p *= radius after you quantize the normal, this way you can skip the normalization

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


positions[i] = { p.x, p.y, p.z };
memcpy(normals + i, &n, sizeof(n));
uvs[i] = { packSnorm(f_i * tesselationRec), packSnorm(0.0) };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, encoding to int8_t signed fromat for UVs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


positions[i + halfIx] = { p.x, p.y, length };
normals[i + halfIx] = normals[i];
uvs[i + halfIx] = { packSnorm(1.0f), packSnorm(0.0f) };

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? why on earth are all the UVs the same at the top of the cylinder?

Comment on lines 613 to 615
constexpr uint32_t RowCount = 2u;
const auto IndexCount = 3 * tesselation;
const auto bytesize = sizeof(index_t) * IndexCount;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is wrong, sizing is like for a cylinder, but this is a cone (only 1 row)

Comment on lines 700 to 726
hlsl::float32_t3 v(std::cos(i * step), 0.0f, std::sin(i * step));
v *= radius;

positions[i] = { v.x, v.y, v.z };
positions[apexVertexBase_i + i] = { apexVertexCoords.x, apexVertexCoords.y, apexVertexCoords.z };

const auto simdPosition = hlsl::float32_t3(positions[i].x, positions[i].y, positions[i].z);
const hlsl::float32_t3 v0ToApex = apexVertexCoords - simdPosition;

uint32_t nextVertexIndex = i == (tesselation - 1) ? 0 : i + 1;
core::vectorSIMDf u1 = core::vectorSIMDf(baseVertices[nextVertexIndex].pos[0], baseVertices[nextVertexIndex].pos[1], baseVertices[nextVertexIndex].pos[2]);
u1 -= core::vectorSIMDf(baseVertices[i].pos[0], baseVertices[i].pos[1], baseVertices[i].pos[2]);
float angleWeight = std::acos(core::dot(core::normalize(apexVertexCoords), core::normalize(u1)).x);
u1 = core::normalize(core::cross(v0ToApex, u1)) * angleWeight;
hlsl::float32_t3 u1 = hlsl::float32_t3(positions[nextVertexIndex].x, positions[nextVertexIndex].y, positions[nextVertexIndex].z);
u1 -= simdPosition;
float angleWeight = std::acos(hlsl::dot(hlsl::normalize(apexVertexCoords), hlsl::normalize(u1)));
u1 = hlsl::normalize(hlsl::cross(v0ToApex, u1)) * angleWeight;

uint32_t prevVertexIndex = i == 0 ? (tesselation - 1) : i - 1;
core::vectorSIMDf u2 = core::vectorSIMDf(baseVertices[prevVertexIndex].pos[0], baseVertices[prevVertexIndex].pos[1], baseVertices[prevVertexIndex].pos[2]);
u2 -= core::vectorSIMDf(baseVertices[i].pos[0], baseVertices[i].pos[1], baseVertices[i].pos[2]);
angleWeight = std::acos(core::dot(core::normalize(apexVertexCoords), core::normalize(u2)).x);
u2 = core::normalize(core::cross(u2, v0ToApex)) * angleWeight;
hlsl::float32_t3 u2 = hlsl::float32_t3(positions[prevVertexIndex].x, positions[prevVertexIndex].y, positions[prevVertexIndex].z);
u2 -= simdPosition;
angleWeight = std::acos(hlsl::dot(hlsl::normalize(apexVertexCoords), hlsl::normalize(u2)));
u2 = hlsl::normalize(hlsl::cross(u2, v0ToApex)) * angleWeight;


const auto baseNormal = quantNormalCache->quantize<NormalCacheFormat>(hlsl::normalize(u1 + u2));
memcpy(normals + i, &baseNormal, sizeof(baseNormal));

baseVertices[i].normal = quantNormalCache->quantize<EF_A2B10G10R10_SNORM_PACK32>(core::normalize(u1 + u2));
apexVertices[i].normal = quantNormalCache->quantize<EF_A2B10G10R10_SNORM_PACK32>(core::normalize(u1));
const auto apexNormal = quantNormalCache->quantize<NormalCacheFormat>(hlsl::normalize(u1));
memcpy(normals + apexVertexBase_i + i, &apexNormal, sizeof(apexNormal));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a cone can't have smooth normals that make sense, don't generate them.

Comment on lines 749 to 760
auto conePositions = reinterpret_cast<position_t*>(cone->getPositionView().src.buffer->getPointer());

const auto coneVertexCount = cone->getPositionView().getElementCount();

for (auto i = 0ull; i < coneVertexCount; ++i)
{
indices[i * 3] = firstIndexOfApexVertices + i;
indices[(i * 3) + 1] = firstIndexOfBaseVertices + i;
indices[(i * 3) + 2] = i == (tesselation - 1) ? firstIndexOfBaseVertices : firstIndexOfBaseVertices + i + 1;
auto& conePosition = conePositions[i];
core::vector3df_SIMD newPos(conePosition.x, conePosition.y, conePosition.z);
newPos.rotateYZByRAD(-1.5707963268);

conePosition = {newPos.x, newPos.y, newPos.z};
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, a geometry collection has transform matrices for each geometry, you shouldn't need to transform vertices in the actual vertex buffer

*/

class Icosphere
{
public:
using index_t = unsigned int;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use sized types like uint32_t

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


retval->setIndexView({
.composed = {
.encodedDataRange = {.u32=aabb},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set the AABB via a visitor or after you set the index view, because when format is u16, you're assining to u32 but you should assign to u16

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 1004 to 1006
unsigned int getPositionCount() const { return (unsigned int)vertices.size() / 3; }
unsigned int getNormalCount() const { return (unsigned int)normals.size() / 3; }
unsigned int getTexCoordCount() const { return (unsigned int)texCoords.size() / 2; }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all these conts must be identical anyway

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

const float* getNormals() const { return normals.data(); }
const float* getTexCoords() const { return texCoords.data(); }
const unsigned int* getIndices() const { return indices.data(); }
const unsigned int* getLineIndices() const { return lineIndices.data(); }

// for interleaved vertices: V/N/T
unsigned int getInterleavedVertexCount() const { return getVertexCount(); } // # of vertices
unsigned int getInterleavedVertexCount() const { return getPositionCount(); } // # of vertices
unsigned int getInterleavedVertexSize() const { return (unsigned int)interleavedVertices.size() * sizeof(float); } // # of bytes
int getInterleavedStride() const { return interleavedStride; } // should be 32 bytes
const float* getInterleavedVertices() const { return interleavedVertices.data(); }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the icosphere can't use interleaved vertex format, must use separated

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

icosphereGeometry.indexType = EIT_32BIT;

return icosphereGeometry;
using uv_t = uint32_t;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use uint16_t2 for clarity

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines -60 to -62
inline core::vectorSIMDu32 getValue() const
hlsl::float32_t3 getValue() const
{
return core::vectorSIMDu32(x,y,z);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this used to return the counterpart to uint32_t4, it could return vector<uint8_t,3> I guess, but definitely not float

Comment on lines -46 to -52
explicit Vector8u3(const core::vectorSIMDu32& val)
explicit Vector8u3(const hlsl::float32_t3& val)
{
operator=(val);
}

Vector8u3& operator=(const Vector8u3&) = default;
Vector8u3& operator=(const core::vectorSIMDu32& val)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as #894 (comment)

storage = val.x|(val.y<<storageBits)|(val.z<<(storageBits*2u));
constexpr auto storageBits = quantizationBits + 1u;
hlsl::uint32_t3 u32_val = { val.x, val.y, val.z };
storage = u32_val.x | (u32_val.y << storageBits) | (u32_val.z << (storageBits * 2u));
return *this;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this used to use the counterpart to uint32_t4, it could do vector<uint16_t,3> I guess, but definitely not float

return core::vectorSIMDu32(storage,storage>>storageBits,storage>>(storageBits*2u))&mask;
constexpr auto storageBits = quantizationBits + 1u;
const auto mask = (0x1u << storageBits) - 1u;
return { storage & mask, (storage >> storageBits) & mask, (storage >> (storageBits * 2)) & mask};
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +405 to +407
//return quantized.
const auto negativeMulVec = hlsl::float32_t3(negativeMask.x ? -1 : 1, negativeMask.y ? -1 : 1, negativeMask.z ? -1 : 1);
return value_type_t<CacheFormat>(negativeMulVec * quantized.getValue());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the old code used to use 2s complement conciously, now you're doing a weird denormalized int- > float -> mul -> retruncate pipeline

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getValue() should return uint16_t, uint32_t, uint64_t (128bit not needed because thats not compressed in any way) representation of the raw bits storing the quantized value

the xorflag should probably be some constexpr static inline

Comment on lines +411 to +422
static inline hlsl::float32_t3 findBestFit(const hlsl::float32_t3& value)
{
static_assert(dimensions>1u,"No point");
static_assert(dimensions<=4u,"High Dimensions are Hard!");
// precise normalize
const auto vectorForDots = value.preciseDivision(length(value));

const auto vectorForDots = hlsl::normalize(value);

//
core::vectorSIMDf fittingVector;
core::vectorSIMDf floorOffset;
hlsl::float32_t3 fittingVector;
hlsl::float32_t3 floorOffset;
constexpr uint32_t cornerCount = (0x1u<<(dimensions-1u))-1u;
core::vectorSIMDf corners[cornerCount] = {};
hlsl::float32_t3 corners[cornerCount] = {};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong type used float32_t3 for everything, should be vector<float32_t,dimensions> instead!

Comment on lines +382 to +386
value_type_t<CacheFormat> quantize(const hlsl::float32_t3& value)
{
const auto negativeMask = value < core::vectorSIMDf(0.0f);
const auto negativeMask = lessThan(value, hlsl::float32_t3(0.0f));

const core::vectorSIMDf absValue = abs(value);
const hlsl::float32_t3 absValue = abs(value);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong type used, vector<float32_t,dimensions> please

@@ -466,18 +467,18 @@ class CDirQuantCacheBase : public virtual core::IReferenceCounted, public impl::
};

constexpr uint32_t cubeHalfSize = (0x1u << quantizationBits) - 1u;
const core::vectorSIMDf cubeHalfSizeND = core::vectorSIMDf(cubeHalfSize);
const hlsl::float32_t3 cubeHalfSizeND = hlsl::float32_t3(cubeHalfSize);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use promote<vector_type> to turn scalars into vectors of the repeated scalar

Comment on lines +6 to +17
constexpr static int32_t findLSB(size_t val)
{
if constexpr(std::is_constant_evaluated())
{
for (size_t ix=0ull; ix<sizeof(size_t)*8; ix++)
if ((0x1ull << ix) & val) return ix;
return ~0u;
} else
{
return hlsl::findLSB(val);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not fix up the hlsl::findLSB to have the if constexpr(std::is_constant_evaluated()) under ifndef __HLSL_VERSION

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants