Mesh loaders kevin #894

kevyuu · 2025-06-25T11:30:57Z

Implement extra geometries

src/nbl/asset/utils/CGeometryCreator.cpp

devshgraphicsprogramming · 2025-06-25T11:32:40Z

src/nbl/asset/utils/CGeometryCreator.cpp

-			hlsl::vector<int8_t,3>(0, 0, 127),
+			hlsl::vector<int8_t,3>(0, 0, 1),


huh ? you've made a bug

its still 1 on the latest commit

devshgraphicsprogramming · 2025-06-25T14:01:18Z

src/nbl/asset/utils/CGeometryCreator.cpp

+				.format = EF_R16_UINT,
+				.rangeFormat = IGeometryBase::EAABBFormat::U16


you defined index_t = uint32_t ... then format should be ~~R16 and U16 respectively~~

Edit: I meant R32 and U32 wote wrong thing by accident

still wrong, you've got 32bit index but you're saying the index buffer view fromat is 16 bit which will cause the sphere to render wrong

src/nbl/asset/utils/CGeometryCreator.cpp

devshgraphicsprogramming · 2025-06-25T14:08:38Z

src/nbl/asset/utils/CGeometryCreator.cpp

+  auto cylinder = createCylinder(width0, cylinderHeight, tesselationCylinder, vtxColor0);
+  auto cone = createCone(width1, height-cylinderHeight, tesselationCone, vtxColor1, vtxColor1);
+
+	auto cylinderPositions = reinterpret_cast<position_t*>(cylinder->getPositionView().src.buffer->getPointer());
+	auto conePositions = reinterpret_cast<position_t*>(cone->getPositionView().src.buffer->getPointer());
+
+	const auto cylinderNormals = reinterpret_cast<normal_t*>(cylinder->getNormalView().src.buffer->getPointer());
+	const auto coneNormals = reinterpret_cast<normal_t*>(cone->getNormalView().src.buffer->getPointer());
+
+	const auto cylinderUvs = reinterpret_cast<uv_t*>(cylinder->getAuxAttributeViews()->front().src.buffer->getPointer());
+	const auto coneUvs = reinterpret_cast<uv_t*>(cone->getAuxAttributeViews()->front().src.buffer->getPointer());
+
+	const auto cylinderIndices = cylinder->getIndexView().src.buffer->getPointer();
+	const auto coneIndices = cone->getIndexView().src.buffer->getPointer();
+
+	const auto cylinderVertexCount = cylinder->getPositionView().getElementCount();
+	const auto coneVertexCount = cone->getPositionView().getElementCount();
+	const auto newArrowVertexCount = cylinderVertexCount + coneVertexCount;
+
+	const auto cylinderIndexCount = cylinder->getVertexReferenceCount();
+	const auto coneIndexCount = cone->getVertexReferenceCount();
+	const auto newArrowIndexCount = cylinderIndexCount + coneIndexCount;


Actually. make this function return a geometry collection!

Its a perfect use case! (untouched geometries, just transforms applied)

You can skip rendering the arrow in ex09 (only test with BLAS stuff in 67)

ex 67 does not test the collection

devshgraphicsprogramming · 2025-07-17T12:13:49Z

include/nbl/asset/utils/CGeometryCreator.h

@@ -58,11 +58,10 @@ class NBL_API2 CGeometryCreator final : public core::IReferenceCounted
 		\param colorCone color of the cone
 		\return Generated mesh.
 		*/
-		core::smart_refctd_ptr<ICPUPolygonGeometry> createArrow(const uint32_t tesselationCylinder = 4,
+		core::vector<core::smart_refctd_ptr<ICPUPolygonGeometry>> createArrow(const uint32_t tesselationCylinder = 4,


no, return a ICPUGeometryCollection

P.S. never use containers such as vector for cross-DLL calls (anything thats not inline)

devshgraphicsprogramming · 2025-07-17T12:36:56Z

src/nbl/asset/utils/CGeometryCreator.cpp

+			uvs = reinterpret_cast<decltype(uvs)>(buff->getPointer());
+			shapes::AABB<4, uint16_t> aabb;
+			aabb.minVx = hlsl::vector<uint8_t, 4>(0,0,0,0);
+			aabb.maxVx = hlsl::vector<uint8_t, 4>(255,255,0,0);


wrong max value, might want to use numeric_limits or something to work that out/keep in sync

devshgraphicsprogramming · 2025-07-17T12:37:27Z

src/nbl/asset/utils/CGeometryCreator.cpp

+					.format = EF_R8G8_UNORM,
+					.rangeFormat = IGeometryBase::EAABBFormat::U8_NORM


wrong formats, should be 16bit

devshgraphicsprogramming · 2025-07-17T12:38:43Z

src/nbl/asset/utils/CGeometryCreator.cpp

-					tu = static_cast<float>(acos(core::clamp(normal.X / sinay, -1.0, 1.0)) * 0.5 * core::RECIPROCAL_PI<double>());
-				if (normal.Z < 0.0f)
+				if (normal.y != -1.0f && normal.y != 1.0f)
+					tu = static_cast<float>(acos(core::clamp(normal.x / sinay, -1.0, 1.0)) * 0.5 * core::RECIPROCAL_PI<double>());


get rid of core::RECIPROCAL_PI we have some numbers.hlsl header that gives that

do everything in hlsl:: tgmath, also double overkill here, use float in the templates

devshgraphicsprogramming · 2025-07-17T12:41:40Z

src/nbl/asset/utils/CGeometryCreator.cpp

-			((uint32_t*)tmpMemPtr)[6] = ((uint32_t*)oldTmpMemPtr)[6];
-			tmpMemPtr += vertexSize;
+			positions[vertex_i] = positions[old_vertex_i];
+			uvs[vertex_i] = { 127, uvs[old_vertex_i].y };


again, stop using constant literals, you're making the code exteremely bug prone (you're supposed to have uint16_t max value here, not int8_t max)

would actually be best to use encodePixel with 1.f for X or pack<uint16_t>(1.0) to make sure hardcoding doesn't hurt you

devshgraphicsprogramming · 2025-07-17T12:42:55Z

src/nbl/asset/utils/CGeometryCreator.cpp

+static uint8_t packSnorm(float val)
+{
+	return round(hlsl::clamp(val, -1.0f, 1.0f) * 127);
+}


erm you're using this to encode sphere UV coords which should be unorm16, just use encodePixel or make this templated so one can't make mistakes like this with implicit conversions

Removed. use encode pixels isntead

devshgraphicsprogramming · 2025-07-17T12:48:14Z

src/nbl/asset/utils/CGeometryCreator.cpp

+		const auto f_i = static_cast<float>(i);
+		hlsl::float32_t3 p(std::cos(f_i * step), std::sin(f_i * step), 0.f);
+		p *= radius;
+		const auto n = quantNormalCache->quantize<NormalCacheFormat>(hlsl::normalize(p));


you can also p *= radius after you quantize the normal, this way you can skip the normalization

devshgraphicsprogramming · 2025-07-17T12:49:02Z

src/nbl/asset/utils/CGeometryCreator.cpp

+
+		positions[i] = { p.x, p.y, p.z };
+		memcpy(normals + i, &n, sizeof(n));
+		uvs[i] = { packSnorm(f_i * tesselationRec), packSnorm(0.0) };


again, encoding to int8_t signed fromat for UVs

devshgraphicsprogramming · 2025-07-17T12:49:25Z

src/nbl/asset/utils/CGeometryCreator.cpp

+
+		positions[i + halfIx] = { p.x, p.y, length };
+		normals[i + halfIx] = normals[i];
+		uvs[i + halfIx] = { packSnorm(1.0f), packSnorm(0.0f) };


huh? why on earth are all the UVs the same at the top of the cylinder?

devshgraphicsprogramming · 2025-07-17T12:50:56Z

src/nbl/asset/utils/CGeometryCreator.cpp

+		constexpr uint32_t RowCount = 2u;
+		const auto IndexCount = 3 * tesselation;
+		const auto bytesize = sizeof(index_t) * IndexCount;


this is wrong, sizing is like for a cylinder, but this is a cone (only 1 row)

devshgraphicsprogramming · 2025-07-17T12:52:52Z

src/nbl/asset/utils/CGeometryCreator.cpp

+		hlsl::float32_t3 v(std::cos(i * step), 0.0f, std::sin(i * step));
+		v *= radius;
+
+		positions[i] = { v.x, v.y, v.z };
+		positions[apexVertexBase_i + i] = { apexVertexCoords.x, apexVertexCoords.y, apexVertexCoords.z };
+
+		const auto simdPosition = hlsl::float32_t3(positions[i].x, positions[i].y, positions[i].z);
+		const hlsl::float32_t3 v0ToApex = apexVertexCoords - simdPosition;

 		uint32_t nextVertexIndex = i == (tesselation - 1) ? 0 : i + 1;
-		core::vectorSIMDf u1 = core::vectorSIMDf(baseVertices[nextVertexIndex].pos[0], baseVertices[nextVertexIndex].pos[1], baseVertices[nextVertexIndex].pos[2]);
-		u1 -= core::vectorSIMDf(baseVertices[i].pos[0], baseVertices[i].pos[1], baseVertices[i].pos[2]);
-		float angleWeight = std::acos(core::dot(core::normalize(apexVertexCoords), core::normalize(u1)).x);
-		u1 = core::normalize(core::cross(v0ToApex, u1)) * angleWeight;
+		hlsl::float32_t3 u1 = hlsl::float32_t3(positions[nextVertexIndex].x, positions[nextVertexIndex].y, positions[nextVertexIndex].z);
+		u1 -= simdPosition;
+		float angleWeight = std::acos(hlsl::dot(hlsl::normalize(apexVertexCoords), hlsl::normalize(u1)));
+		u1 = hlsl::normalize(hlsl::cross(v0ToApex, u1)) * angleWeight;

 		uint32_t prevVertexIndex = i == 0 ? (tesselation - 1) : i - 1;
-		core::vectorSIMDf u2 = core::vectorSIMDf(baseVertices[prevVertexIndex].pos[0], baseVertices[prevVertexIndex].pos[1], baseVertices[prevVertexIndex].pos[2]);
-		u2 -= core::vectorSIMDf(baseVertices[i].pos[0], baseVertices[i].pos[1], baseVertices[i].pos[2]);
-		angleWeight = std::acos(core::dot(core::normalize(apexVertexCoords), core::normalize(u2)).x);
-		u2 = core::normalize(core::cross(u2, v0ToApex)) * angleWeight;
+		hlsl::float32_t3 u2 = hlsl::float32_t3(positions[prevVertexIndex].x, positions[prevVertexIndex].y, positions[prevVertexIndex].z);
+		u2 -= simdPosition;
+		angleWeight = std::acos(hlsl::dot(hlsl::normalize(apexVertexCoords), hlsl::normalize(u2)));
+		u2 = hlsl::normalize(hlsl::cross(u2, v0ToApex)) * angleWeight;
+
+
+		const auto baseNormal = quantNormalCache->quantize<NormalCacheFormat>(hlsl::normalize(u1 + u2));
+		memcpy(normals + i, &baseNormal, sizeof(baseNormal));

-		baseVertices[i].normal = quantNormalCache->quantize<EF_A2B10G10R10_SNORM_PACK32>(core::normalize(u1 + u2));
-		apexVertices[i].normal = quantNormalCache->quantize<EF_A2B10G10R10_SNORM_PACK32>(core::normalize(u1));
+		const auto apexNormal = quantNormalCache->quantize<NormalCacheFormat>(hlsl::normalize(u1));
+		memcpy(normals + apexVertexBase_i + i, &apexNormal, sizeof(apexNormal));


a cone can't have smooth normals that make sense, don't generate them.

devshgraphicsprogramming · 2025-07-17T12:53:49Z

src/nbl/asset/utils/CGeometryCreator.cpp

+	auto conePositions = reinterpret_cast<position_t*>(cone->getPositionView().src.buffer->getPointer());
+
+	const auto coneVertexCount = cone->getPositionView().getElementCount();
+
+	for (auto i = 0ull; i < coneVertexCount; ++i)
 	{
-		indices[i * 3] = firstIndexOfApexVertices + i;
-		indices[(i * 3) + 1] = firstIndexOfBaseVertices + i;
-		indices[(i * 3) + 2] = i == (tesselation - 1) ? firstIndexOfBaseVertices : firstIndexOfBaseVertices + i + 1;
+		auto& conePosition = conePositions[i];
+		core::vector3df_SIMD newPos(conePosition.x, conePosition.y, conePosition.z);
+		newPos.rotateYZByRAD(-1.5707963268);
+
+		conePosition = {newPos.x, newPos.y, newPos.z};
 	}


no, a geometry collection has transform matrices for each geometry, you shouldn't need to transform vertices in the actual vertex buffer

devshgraphicsprogramming · 2025-07-17T12:54:09Z

src/nbl/asset/utils/CGeometryCreator.cpp

 */

 class Icosphere
 {
 public:
+	using index_t = unsigned int;


please use sized types like uint32_t

devshgraphicsprogramming · 2025-07-17T12:55:46Z

src/nbl/asset/utils/CGeometryCreator.cpp

+
+		retval->setIndexView({
+			.composed = {
+				.encodedDataRange = {.u32=aabb},


set the AABB via a visitor or after you set the index view, because when format is u16, you're assining to u32 but you should assign to u16

devshgraphicsprogramming · 2025-07-17T12:57:11Z

src/nbl/asset/utils/CGeometryCreator.cpp

+	unsigned int getPositionCount() const { return (unsigned int)vertices.size() / 3; }
 	unsigned int getNormalCount() const { return (unsigned int)normals.size() / 3; }
 	unsigned int getTexCoordCount() const { return (unsigned int)texCoords.size() / 2; }


all these conts must be identical anyway

devshgraphicsprogramming · 2025-07-17T12:58:08Z

src/nbl/asset/utils/CGeometryCreator.cpp

 	const float* getNormals() const { return normals.data(); }
 	const float* getTexCoords() const { return texCoords.data(); }
 	const unsigned int* getIndices() const { return indices.data(); }
 	const unsigned int* getLineIndices() const { return lineIndices.data(); }

 	// for interleaved vertices: V/N/T
-	unsigned int getInterleavedVertexCount() const { return getVertexCount(); }    // # of vertices
+	unsigned int getInterleavedVertexCount() const { return getPositionCount(); }    // # of vertices
 	unsigned int getInterleavedVertexSize() const { return (unsigned int)interleavedVertices.size() * sizeof(float); }    // # of bytes
 	int getInterleavedStride() const { return interleavedStride; }   // should be 32 bytes
 	const float* getInterleavedVertices() const { return interleavedVertices.data(); }


the icosphere can't use interleaved vertex format, must use separated

devshgraphicsprogramming · 2025-07-17T13:07:14Z

src/nbl/asset/utils/CGeometryCreator.cpp

-	icosphereGeometry.indexType = EIT_32BIT;
-
-	return icosphereGeometry;
+			using uv_t = uint32_t;


use uint16_t2 for clarity

devshgraphicsprogramming · 2025-07-17T13:08:59Z

include/nbl/asset/utils/CDirQuantCacheBase.h

-				inline core::vectorSIMDu32 getValue() const
+        hlsl::float32_t3 getValue() const
 				{
-					return core::vectorSIMDu32(x,y,z);


this used to return the counterpart to uint32_t4, it could return vector<uint8_t,3> I guess, but definitely not float

devshgraphicsprogramming · 2025-07-17T13:09:31Z

include/nbl/asset/utils/CDirQuantCacheBase.h

-				explicit Vector8u3(const core::vectorSIMDu32& val)
+				explicit Vector8u3(const hlsl::float32_t3& val)
 				{
 					operator=(val);
 				}

 				Vector8u3& operator=(const Vector8u3&) = default;
-				Vector8u3& operator=(const core::vectorSIMDu32& val)


same as #894 (comment)

devshgraphicsprogramming · 2025-07-17T13:12:27Z

include/nbl/asset/utils/CDirQuantCacheBase.h

-					storage = val.x|(val.y<<storageBits)|(val.z<<(storageBits*2u));
+					constexpr auto storageBits = quantizationBits + 1u;
+					hlsl::uint32_t3 u32_val = { val.x, val.y, val.z };
+					storage = u32_val.x | (u32_val.y << storageBits) | (u32_val.z << (storageBits * 2u));
 					return *this;


this used to use the counterpart to uint32_t4, it could do vector<uint16_t,3> I guess, but definitely not float

devshgraphicsprogramming · 2025-07-17T13:12:39Z

include/nbl/asset/utils/CDirQuantCacheBase.h

-					return core::vectorSIMDu32(storage,storage>>storageBits,storage>>(storageBits*2u))&mask;
+					constexpr auto storageBits = quantizationBits + 1u;
+					const auto mask = (0x1u << storageBits) - 1u;
+					return { storage & mask, (storage >> storageBits) & mask, (storage >> (storageBits * 2)) & mask};
 				}


same as https://github.com/Devsh-Graphics-Programming/Nabla/pull/894/files#r2213314777

devshgraphicsprogramming · 2025-07-17T13:15:06Z

include/nbl/asset/utils/CDirQuantCacheBase.h

+			//return quantized.
+			const auto negativeMulVec = hlsl::float32_t3(negativeMask.x ? -1 : 1, negativeMask.y ? -1 : 1, negativeMask.z ? -1 : 1);
+      return value_type_t<CacheFormat>(negativeMulVec * quantized.getValue());


the old code used to use 2s complement conciously, now you're doing a weird denormalized int- > float -> mul -> retruncate pipeline

getValue() should return uint16_t, uint32_t, uint64_t (128bit not needed because thats not compressed in any way) representation of the raw bits storing the quantized value

the xorflag should probably be some constexpr static inline

devshgraphicsprogramming · 2025-07-17T13:18:22Z

include/nbl/asset/utils/CDirQuantCacheBase.h

+		static inline hlsl::float32_t3 findBestFit(const hlsl::float32_t3& value)
 		{
 			static_assert(dimensions>1u,"No point");
 			static_assert(dimensions<=4u,"High Dimensions are Hard!");
-			// precise normalize
-			const auto vectorForDots = value.preciseDivision(length(value));
+
+			const auto vectorForDots = hlsl::normalize(value);

 			//
-			core::vectorSIMDf fittingVector;
-			core::vectorSIMDf floorOffset;
+			hlsl::float32_t3 fittingVector;
+			hlsl::float32_t3 floorOffset;
 			constexpr uint32_t cornerCount = (0x1u<<(dimensions-1u))-1u;
-			core::vectorSIMDf corners[cornerCount] = {};
+			hlsl::float32_t3 corners[cornerCount] = {};


wrong type used float32_t3 for everything, should be vector<float32_t,dimensions> instead!

devshgraphicsprogramming · 2025-07-17T13:18:50Z

include/nbl/asset/utils/CDirQuantCacheBase.h

+		value_type_t<CacheFormat> quantize(const hlsl::float32_t3& value)
 		{
-			const auto negativeMask = value < core::vectorSIMDf(0.0f);
+			const auto negativeMask = lessThan(value, hlsl::float32_t3(0.0f));

-			const core::vectorSIMDf absValue = abs(value);
+			const hlsl::float32_t3 absValue = abs(value);


wrong type used, vector<float32_t,dimensions> please

devshgraphicsprogramming · 2025-07-17T13:19:41Z

include/nbl/asset/utils/CDirQuantCacheBase.h

@@ -466,18 +467,18 @@ class CDirQuantCacheBase : public virtual core::IReferenceCounted, public impl::
 			};

 			constexpr uint32_t cubeHalfSize = (0x1u << quantizationBits) - 1u;
-			const core::vectorSIMDf cubeHalfSizeND = core::vectorSIMDf(cubeHalfSize);
+			const hlsl::float32_t3 cubeHalfSizeND = hlsl::float32_t3(cubeHalfSize);


use promote<vector_type> to turn scalars into vectors of the repeated scalar

devshgraphicsprogramming · 2025-07-17T13:22:43Z

src/nbl/asset/ECommonEnums.cpp

+constexpr static int32_t findLSB(size_t val)
+{
+	if constexpr(std::is_constant_evaluated())
+	{
+		for (size_t ix=0ull; ix<sizeof(size_t)*8; ix++)
+			if ((0x1ull << ix) & val) return ix;
+		return ~0u;
+	} else
+	{
+		return hlsl::findLSB(val);
+	}
+}


why not fix up the hlsl::findLSB to have the if constexpr(std::is_constant_evaluated()) under ifndef __HLSL_VERSION

kevyuu added 2 commits June 25, 2025 18:29

Implement createCylinder, createCone, createSphere, CreateArrow

4a37ba5

Implement getIndexType convenience function for IPolygonGeometry

2d8b7c4

devshgraphicsprogramming reviewed Jun 25, 2025

View reviewed changes

src/nbl/asset/utils/CGeometryCreator.cpp Outdated Show resolved Hide resolved