Image upload benchmark #238

CrabExtra · 2025-12-24T15:15:16Z

Added a simple image upload benchmark.


Erfan-Ahmadi · 2025-12-25T09:08:05Z

73_ImageUploadBenchmark/main.cpp

+		{
+			IGPUImage::SCreationParams imgParams{};
+			imgParams.type = IImage::E_TYPE::ET_2D;
+			imgParams.extent.width = TILE_SIZE * 32;


This should depend on your TILES_PER_FRAME instead of a hard-coded 32.
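One way to derive the extent from the tile count is sketched below. It assumes the tiles are packed into a near-square grid, which the PR does not state; `Extent2D` and `imageExtentForTiles` are hypothetical names for illustration only.

```cpp
#include <cstdint>

struct Extent2D { uint32_t width; uint32_t height; };

// Assumption: tiles are laid out row by row in the smallest near-square grid
// that holds tilesPerFrame tiles. The benchmark may pack them differently.
constexpr Extent2D imageExtentForTiles(uint32_t tileSize, uint32_t tilesPerFrame)
{
    uint32_t tilesPerRow = 1;
    while (tilesPerRow * tilesPerRow < tilesPerFrame)
        ++tilesPerRow; // integer ceil(sqrt(tilesPerFrame))
    const uint32_t rows = (tilesPerFrame + tilesPerRow - 1) / tilesPerRow; // round up
    return { tileSize * tilesPerRow, tileSize * rows };
}
```

With this, `imgParams.extent.width` and `height` would follow `TILES_PER_FRAME` automatically instead of needing a manual update whenever the tile count changes.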

Erfan-Ahmadi · 2025-12-25T09:28:26Z

73_ImageUploadBenchmark/main.cpp

+
+			uint32_t bufferOffset = cmdBufIndex * partitionSize;
+			void* targetPtr = static_cast<uint8_t*>(mappedPtr) + bufferOffset;
+			memcpy(targetPtr, cpuSourceData, partitionSize);


You need to check whether the buffer behind the mapped pointer was created on HOST_COHERENT memory. If not, you need to flush the written range so it becomes available to the GPU.
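A minimal sketch of the check and of the range arithmetic follows. It is written against plain integers rather than the Nabla API, so `needsFlush`, `alignedFlushRange` and `MappedRange` are hypothetical helpers. The alignment rule it encodes is the Vulkan one for mapped memory ranges: the offset is a multiple of `nonCoherentAtomSize`, and the size is either a multiple of it or runs to the end of the allocation.

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for VK_MEMORY_PROPERTY_HOST_COHERENT_BIT (0x4 in the Vulkan headers).
constexpr uint32_t HOST_COHERENT_BIT = 0x4u;

struct MappedRange { uint64_t offset; uint64_t size; };

// True when CPU writes must be flushed explicitly before the GPU reads them.
constexpr bool needsFlush(uint32_t memoryPropertyFlags)
{
    return (memoryPropertyFlags & HOST_COHERENT_BIT) == 0u;
}

// Expand [offset, offset + size) outward to multiples of atomSize,
// clamping the end to the size of the whole allocation.
inline MappedRange alignedFlushRange(uint64_t offset, uint64_t size,
                                     uint64_t atomSize, uint64_t allocationSize)
{
    const uint64_t begin = (offset / atomSize) * atomSize;
    const uint64_t roundedEnd = ((offset + size + atomSize - 1) / atomSize) * atomSize;
    const uint64_t end = std::min(roundedEnd, allocationSize);
    return { begin, end - begin };
}
```

In the benchmark, the range to flush after the `memcpy` would start at `bufferOffset` and span `partitionSize` bytes. The actual flush call on the device is left out here because its exact Nabla signature is not shown in this thread.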

Erfan-Ahmadi · 2025-12-25T09:35:21Z

73_ImageUploadBenchmark/main.cpp

+	}
+
+private:
+	void transitionImageLayout(


I know it's a PITA to have to write barriers, but trust me, if we could have had a simpler transition/barrier function, we would've made it.

What you can do is create one default barrier where the image and subresourceRange are fixed, and start from a copy of that each time. Only the src/dst access and stage mask combinations will differ from barrier to barrier.

Pro tip: enable synchronization validation by overriding getAPIFeaturesToEnable() and watch the Vulkan validation layer output for missing barriers and other sync errors.

virtual video::IAPIConnection::SFeatures getAPIFeaturesToEnable() override
{
	auto retval = base_t::getAPIFeaturesToEnable();
	retval.validations = true;
	retval.synchronizationValidation = true;
	return retval;
}
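The "default barrier" suggestion in this comment could look roughly like the sketch below. All types here (`Stage`, `Access`, `Layout`, `SubresourceRange`, `ImageBarrier`) and the helper `withMasks` are hypothetical stand-ins, not the Nabla barrier structs; the point is only the pattern of fixing the image and subresource range once and varying the masks per use.

```cpp
#include <cstdint>

// Hypothetical stand-ins; the real Nabla/Vulkan types and flag names differ.
enum class Stage : uint32_t { None, Copy, FragmentShader };
enum class Access : uint32_t { None, TransferWrite, ShaderRead };
enum class Layout : uint32_t { Undefined, General };

struct SubresourceRange
{
    uint32_t baseMip = 0, mipCount = 1;
    uint32_t baseLayer = 0, layerCount = 1;
};

struct ImageBarrier
{
    const void* image = nullptr;            // fixed once in the template
    SubresourceRange range{};               // fixed once in the template
    Layout oldLayout = Layout::General;
    Layout newLayout = Layout::General;
    Stage srcStage = Stage::None, dstStage = Stage::None;
    Access srcAccess = Access::None, dstAccess = Access::None;
};

// Takes the template by value, so the caller's default barrier is untouched
// and only the stage/access masks change in the returned copy.
inline ImageBarrier withMasks(ImageBarrier base,
                              Stage srcStage, Access srcAccess,
                              Stage dstStage, Access dstAccess)
{
    base.srcStage = srcStage;
    base.srcAccess = srcAccess;
    base.dstStage = dstStage;
    base.dstAccess = dstAccess;
    return base;
}
```

Each call site then reads as a single line naming the two halves of the dependency, which keeps the differences between barriers easy to spot in review.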

Erfan-Ahmadi · 2025-12-25T09:36:54Z

73_ImageUploadBenchmark/main.cpp

+				data[i] = g();
+		}
+
+		auto regionsPerFrame = new IImage::SBufferCopy*[framesInFlight];


We're not going to blame you if you want to use std::vector here instead of raw new[].
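As a sketch of what that could look like, the per-frame region arrays can be a vector of vectors. `BufferCopyRegion` is a hypothetical stand-in for `IImage::SBufferCopy` with only the fields needed here, and the row-by-row tile layout and per-frame staging partition are assumptions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for IImage::SBufferCopy.
struct BufferCopyRegion
{
    uint64_t bufferOffset;
    uint32_t imageX, imageY;
    uint32_t width, height;
};

// One region list per frame in flight. Memory is released automatically
// when the outer vector goes out of scope, so no delete[] loop is needed.
inline std::vector<std::vector<BufferCopyRegion>> makeRegionsPerFrame(
    uint32_t framesInFlight, uint32_t tilesPerFrame,
    uint32_t tilesPerRow, uint32_t tileSize, uint32_t bytesPerPixel)
{
    const uint64_t tileBytes = uint64_t(tileSize) * tileSize * bytesPerPixel;
    const uint64_t partitionSize = tileBytes * tilesPerFrame; // assumed staging partition per frame

    std::vector<std::vector<BufferCopyRegion>> regions(framesInFlight);
    for (uint32_t frame = 0; frame < framesInFlight; ++frame)
    {
        regions[frame].reserve(tilesPerFrame);
        for (uint32_t tile = 0; tile < tilesPerFrame; ++tile)
        {
            regions[frame].push_back({
                frame * partitionSize + tile * tileBytes,
                (tile % tilesPerRow) * tileSize,
                (tile / tilesPerRow) * tileSize,
                tileSize, tileSize });
        }
    }
    return regions;
}
```

`regions[frame].data()` and `regions[frame].size()` can then be passed wherever a pointer and count are expected.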

Erfan-Ahmadi · 2025-12-25T09:48:56Z

73_ImageUploadBenchmark/main.cpp

+
+		uint64_t timelineValue = 0;
+
+		commandBuffers[0]->begin(IGPUCommandBuffer::USAGE::ONE_TIME_SUBMIT_BIT);


It's OK to do this if the end purpose of this image were to be a transfer destination and nothing else :D
But it's going to be used in shaders as well. So what's the first thing you'd think of doing? Transition to TRANSFER_DST for the copy and to SHADER_READ_OPTIMAL after the copy. That makes sense for an image you upload once and then read on the GPU for many future frames. But we're potentially uploading to this image every frame, and it would be redundant to do these image layout transitions each time.
That's why we asked you to keep the GPU image in the GENERAL layout.

Another note:

Pipeline barriers are not only for image layout transitions. You still need them to ensure correct synchronization of accesses to the resource.
So: keep the image in the GENERAL layout and add 2 pipeline barriers (1 before the transfer, 1 after). The one after is a "fake" or "mock" one for now; it's as if the image is going to be read by a shader soon.
Hint: the second barrier, after the transfer, should have SHADER_READ_BIT as its access in the FRAGMENT stage (with no layout transition, i.e. GENERAL->GENERAL).

You'll most likely get a sync validation error (assuming you turned it on) if you don't have any pipeline barrier for this destination image.
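The two barriers described above can be written down as data, as in the sketch below. The enums and structs are hypothetical stand-ins rather than Nabla types. The "after" barrier follows the hint in the comment; the source side of the "before" barrier is an assumption (it waits on the previous frame's mock fragment-shader read, which as a write-after-read hazard needs only an execution dependency and therefore no source access mask).

```cpp
#include <cstdint>

// Hypothetical stand-ins for the real stage, access and layout enums.
enum class Stage : uint32_t { None, Copy, FragmentShader };
enum class Access : uint32_t { None, TransferWrite, ShaderRead };
enum class Layout : uint32_t { Undefined, General };

struct ImageBarrier
{
    Stage srcStage, dstStage;
    Access srcAccess, dstAccess;
    Layout oldLayout, newLayout;
};

struct UploadBarriers
{
    ImageBarrier beforeCopy;
    ImageBarrier afterCopy;
};

constexpr UploadBarriers makeUploadBarriers()
{
    return {
        // Before the copy: earlier fragment-shader reads must finish before
        // the transfer writes. Source access is None (assumption, see lead-in).
        { Stage::FragmentShader, Stage::Copy,
          Access::None, Access::TransferWrite,
          Layout::General, Layout::General },
        // After the copy: make the transfer writes visible to a (mock)
        // fragment-shader read. No layout transition, GENERAL -> GENERAL.
        { Stage::Copy, Stage::FragmentShader,
          Access::TransferWrite, Access::ShaderRead,
          Layout::General, Layout::General }
    };
}
```

Recording order per frame would then be: `beforeCopy` barrier, the single `copyBufferToImage` call, `afterCopy` barrier.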

Erfan-Ahmadi · 2025-12-25T09:56:08Z

73_ImageUploadBenchmark/main.cpp

+
+			commandBuffers[cmdBufIndex]->begin(IGPUCommandBuffer::USAGE::ONE_TIME_SUBMIT_BIT);
+			commandBuffers[cmdBufIndex]->copyBufferToImage(
+				stagingBuffer,


OK good, a single copy call for all tiles. Remember to change the layout here to GENERAL after https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests/pull/238/files#r2646807113

Erfan-Ahmadi · 2025-12-25T09:57:46Z

73_ImageUploadBenchmark/main.cpp

+	{
+		smart_refctd_ptr<ISemaphore> timelineSemaphore = m_device->createSemaphore(0);
+
+		auto commandPools = new smart_refctd_ptr<IGPUCommandPool>[framesInFlight];


Ummm, just use the stack with MAX_FRAMES_IN_FLIGHT?
Or again, nobody is going to blame you if you want to use a vector :D
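A sketch of the stack-based option: a fixed-capacity array sized by a compile-time cap, with the runtime frame count stored next to it. `PerFrame` is a hypothetical helper, and the value chosen for `MAX_FRAMES_IN_FLIGHT` is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

constexpr uint32_t MAX_FRAMES_IN_FLIGHT = 4; // assumed upper bound

// Fixed-capacity per-frame storage that lives on the stack (or inline in the
// owning object). Only the first `count` slots are meant to be used.
template <typename T>
struct PerFrame
{
    std::array<T, MAX_FRAMES_IN_FLIGHT> slots{}; // value-initialized
    uint32_t count = 0;

    // Clamp the requested frame count to the compile-time capacity.
    explicit PerFrame(uint32_t framesInFlight)
        : count(std::min(framesInFlight, MAX_FRAMES_IN_FLIGHT)) {}

    T& operator[](uint32_t i) { return slots[i]; }
    const T& operator[](uint32_t i) const { return slots[i]; }
};
```

Used as `PerFrame<smart_refctd_ptr<IGPUCommandPool>> commandPools(framesInFlight);`, the smart pointers release their objects when the array goes out of scope, so no `delete[]` is needed.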

Erfan-Ahmadi · 2025-12-25T10:00:30Z

73_ImageUploadBenchmark/main.cpp

+		delete[] cpuSourceData;
+		for (uint32_t i = 0; i < framesInFlight; i++)
+			delete[] regionsPerFrame[i];
+		delete[] regionsPerFrame;


I usually cringe when I see raw new/delete in code (PTSD from seeing legacy code).
Depending on the situation, just use:

- stack allocation: type data[MAX_SIZE]
- vector for more variation and dynamically sized stuff
- smart_refctd_ptr for Nabla refcounted stuff (inherited from IRefCounted)
- unique_ptr for custom non-Nabla stuff

And take it easy. It's just an example/benchmark.

CrabExtra added 2 commits December 23, 2025 18:10

Add 73_ImageUploadBenchmark example

6635ba9

Simple benchmark HOST_VISIBLE vs HOST_VISIBLE & DEVICE_LOCAL

951e2fd

Erfan-Ahmadi reviewed Dec 24, 2025


Measurement was weird, added some detail and also fix a bug related to…

141295b

… FIF

Erfan-Ahmadi reviewed Dec 25, 2025
