Commit 0340760

Added splines for camera movement, fixed threading conflict with write_frame(), more accurate runtime estimation, enabled FP16S by default
1 parent 892d5ef commit 0340760

12 files changed, with 155 additions and 54 deletions

DOCUMENTATION.md

Lines changed: 33 additions & 0 deletions
@@ -313,6 +313,39 @@
}
```
- To find suitable camera placement, run the simulation at low resolution in [`INTERACTIVE_GRAPHICS`](src/defines.hpp) mode, rotate/move the camera to the desired position, click the <kbd>Mouse</kbd> to disable mouse rotation, and press <kbd>G</kbd> to print the current camera settings as a copy-paste command in the console. <kbd>Alt</kbd>+<kbd>Tab</kbd> to the console and copy the camera placement command by selecting it with the mouse and right-clicking, then paste it into the [`main_setup()`](src/setup.cpp) function.
+- To fly the camera along a smooth path through a list of provided keyframe camera placements, use `catmull_rom` splines:
+```c
+while(lbm.get_t()<=lbm_T) { // main simulation loop
+	if(lbm.graphics.next_frame(lbm_T, 30.0f)) {
+		const float t = (float)lbm.get_t()/(float)lbm_T;
+		vector<float3> camera_positions = {
+			float3(-0.282220f*(float)Nx, 0.529221f*(float)Ny, 0.304399f*(float)Nz),
+			float3( 0.806921f*(float)Nx, 0.239912f*(float)Ny, 0.436880f*(float)Nz),
+			float3( 1.129724f*(float)Nx, -0.130721f*(float)Ny, 0.352759f*(float)Nz),
+			float3( 0.595601f*(float)Nx, -0.504690f*(float)Ny, 0.203096f*(float)Nz),
+			float3(-0.056776f*(float)Nx, -0.591919f*(float)Ny, -0.416467f*(float)Nz)
+		};
+		vector<float> camera_rx = {
+			116.0f,
+			25.4f,
+			-10.6f,
+			-45.6f,
+			-94.6f
+		};
+		vector<float> camera_ry = {
+			26.0f,
+			33.3f,
+			20.3f,
+			25.3f,
+			-16.7f
+		};
+		const float camera_fov = 90.0f;
+		lbm.graphics.set_camera_free(catmull_rom(camera_positions, t), catmull_rom(camera_rx, t), catmull_rom(camera_ry, t), camera_fov);
+		lbm.graphics.write_frame(get_exe_path()+"export/");
+	}
+	lbm.run(1u, lbm_T);
+}
```
- The visualization mode(s) can be specified as `lbm.graphics.visualization_modes` with the [`VIS_...`](src/defines.hpp) macros. You can also set the `lbm.graphics.slice_mode` (`0`=no slice, `1`=x, `2`=y, `3`=z, `4`=xz, `5`=xyz, `6`=yz, `7`=xy) and reposition the slices with `lbm.graphics.slice_x`/`lbm.graphics.slice_y`/`lbm.graphics.slice_z`.
- Exported frames will automatically be assigned the current simulation time step in their name, in the format `bin/export/image-123456789.png`.
- To convert the rendered `.png` images to video, use [FFmpeg](https://ffmpeg.org/):

README.md

Lines changed: 11 additions & 0 deletions
@@ -182,6 +182,17 @@ The fastest and most memory efficient lattice Boltzmann CFD software, running on
  - fixed minor graphical artifacts in `raytrace_phi()`
  - fixed minor graphical artifacts in `ray_grid_traverse_sum()`
  - fixed wrong printed time step count on raindrop sample setup
+- [v2.19](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.19) (07.09.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.18...v2.19) (camera splines)
+  - the camera can now fly along a smooth path through a list of provided keyframe camera placements, [using Catmull-Rom splines](https://github.com/ProjectPhysX/FluidX3D/blob/master/DOCUMENTATION.md#video-rendering)
+  - more accurate remaining runtime estimation that includes time spent on rendering
+  - enabled FP16S memory compression by default
+  - printed camera placement using key <kbd>G</kbd> is now formatted for easier copy/paste
+  - added benchmark chart in Readme using mermaid gantt chart
+  - placed memory allocation info during simulation startup at better location
+  - fixed threading conflict between `INTERACTIVE_GRAPHICS` and `lbm.graphics.write_frame();`
+  - fixed maximum buffer allocation size limit for AMD GPUs and in Intel CPU Runtime for OpenCL
+  - fixed wrong `Re<Re_max` info printout for 2D simulations
+  - minor fix in `bandwidth_bytes_per_cell_device()`

</details>

src/defines.hpp

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@
#define SRT // choose single-relaxation-time LBM collision operator; (default)
//#define TRT // choose two-relaxation-time LBM collision operator

-//#define FP16S // compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
-//#define FP16C // compress LBM DDFs to more accurate custom FP16C format; number conversion is emulated in software; all arithmetic is still done in FP32
+#define FP16S // optional for 2x speedup and 2x VRAM footprint reduction: compress LBM DDFs to range-shifted IEEE-754 FP16; number conversion is done in hardware; all arithmetic is still done in FP32
+//#define FP16C // optional for 2x speedup and 2x VRAM footprint reduction: compress LBM DDFs to more accurate custom FP16C format; number conversion is emulated in software; all arithmetic is still done in FP32

#define BENCHMARK // disable all extensions and setups and run benchmark setup instead

src/graphics.cpp

Lines changed: 6 additions & 0 deletions
@@ -544,9 +544,11 @@ INT WINAPI WinMain(_In_ HINSTANCE hInstance, _In_opt_ HINSTANCE, _In_ PSTR, _In_
DispatchMessage(&msg);
}
// main loop ################################################################
+camera.rendring_frame.lock(); // block rendering for other threads until finished
camera.update_state(fmax(1.0/(double)camera.fps_limit, frametime));
main_graphics();
update_frame(frametime);
+camera.rendring_frame.unlock();
frametime = clock.stop();
sleep(1.0/(double)camera.fps_limit-frametime);
clock.start();
@@ -723,9 +725,11 @@ int main(int argc, char* argv[]) {
double frametime = 1.0;
while(running) {
// main loop ################################################################
+camera.rendring_frame.lock(); // block rendering for other threads until finished
camera.update_state(fmax(1.0/(double)camera.fps_limit, frametime));
main_graphics();
update_frame(frametime);
+camera.rendring_frame.unlock();
frametime = clock.stop();
sleep(1.0/(double)camera.fps_limit-frametime);
clock.start();
@@ -780,9 +784,11 @@ int main(int argc, char* argv[]) {
get_console_font_size(fontwidth, fontheight);
while(running) {
// main loop ################################################################
+camera.rendring_frame.lock(); // block rendering for other threads until finished
camera.update_state(fmax(1.0/(double)camera.fps_limit, frametime));
main_graphics();
update_frame(frametime);
+camera.rendring_frame.unlock();
frametime = clock.stop();
sleep(1.0/(double)camera.fps_limit-frametime);
clock.start();
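
The three hunks above wrap the interactive render loop in `camera.rendring_frame.lock()` and `unlock()`. The other side of the handshake, presumably the frame export path, is not part of this excerpt. As a hedged illustration of the same pattern, the sketch below serializes two threads on one `std::mutex`, using `std::lock_guard` so the unlock cannot be forgotten. `FrameState`, `draw_frames` and `run_two_renderers` are hypothetical names for this example and do not exist in the project.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the state shared by the interactive and export threads.
struct FrameState {
	std::mutex rendering_frame; // plays the role of camera.rendring_frame in the diff
	unsigned long frames_drawn = 0ul; // shared counter, only modified while the mutex is held
};

// Draw `count` frames; the lock_guard releases the mutex at the end of each iteration.
void draw_frames(FrameState& state, const unsigned long count) {
	for(unsigned long i=0ul; i<count; i++) {
		std::lock_guard<std::mutex> lock(state.rendering_frame); // block other threads until this frame is finished
		state.frames_drawn++;
	}
}

// Run an "interactive" and an "export" thread side by side and return the total frame count.
unsigned long run_two_renderers(const unsigned long frames_per_thread) {
	FrameState state;
	std::thread interactive(draw_frames, std::ref(state), frames_per_thread);
	std::thread exporter(draw_frames, std::ref(state), frames_per_thread);
	interactive.join();
	exporter.join();
	return state.frames_drawn;
}
```

Because every increment happens under the lock, the returned count is always twice `frames_per_thread`. Without the mutex the two threads would race on the counter, which is the kind of conflict the commit message describes for `write_frame()`.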

src/graphics.hpp

Lines changed: 5 additions & 1 deletion
@@ -8,6 +8,7 @@
#include "defines.hpp"
#include "utilities.hpp"
#include <atomic>
+#include <mutex>

extern vector<string> main_arguments; // console arguments
extern std::atomic_bool running;
@@ -34,8 +35,11 @@ class Camera {
bool vr=false, tv=false; // virtual reality mode (enables stereoscopic rendering), VR TV mode
float eye_distance = 8.0f; // distance between cameras in VR mode
bool autorotation = false; // autorotation
-bool key_update = true; // a key variable has been updated
bool lockmouse = false; // mouse movement won't change camera view when this is true
+std::atomic_bool key_update = true; // a key variable has been updated
+std::atomic_bool allow_rendering = false; // allows interactive redering if true
+std::atomic_bool allow_labeling = true; // allows drawing label if true
+std::mutex rendring_frame; // a frame for interactive graphics is currently rendered

private:
float log_zoom=4.0f*log(zoom), target_log_zoom=log_zoom;

src/info.cpp

Lines changed: 32 additions & 29 deletions
@@ -3,28 +3,11 @@

Info info;

-void Info::initialize(LBM* lbm) {
-this->lbm = lbm;
-#if defined(SRT)
-collision = "SRT";
-#elif defined(TRT)
-collision = "TRT";
-#endif // TRT
-#if defined(FP16S)
-collision += " (FP32/FP16S)";
-#elif defined(FP16C)
-collision += " (FP32/FP16C)";
-#else // FP32
-collision += " (FP32/FP32)";
-#endif // FP32
-cpu_mem_required = (uint)(lbm->get_N()*(ulong)bytes_per_cell_host()/1048576ull); // reset to get valid values for consecutive simulations
-gpu_mem_required = lbm->lbm_domain[0]->get_device().info.memory_used;
-}
void Info::append(const ulong steps, const ulong total_steps, const ulong t) {
if(total_steps==max_ulong) { // total_steps is not provided/used
this->steps = steps; // has to be executed before info.print_initialize()
this->steps_last = t; // reset last step count if multiple run() commands are executed consecutively
-this->runtime_lbm_last = runtime_lbm; // reset last runtime if multiple run() commands are executed consecutively
+this->runtime_total_last = this->runtime_total; // reset last runtime if multiple run() commands are executed consecutively
this->runtime_total = clock.stop();
} else { // total_steps has been specified
this->steps = total_steps; // has to be executed before info.print_initialize()
@@ -37,7 +20,8 @@ void Info::update(const double dt) {
this->runtime_total = clock.stop();
}
double Info::time() const { // returns either elapsed time or remaining time
-return steps==max_ulong ? runtime_lbm : ((double)steps/(double)(lbm->get_t()-steps_last)-1.0)*(runtime_lbm-runtime_lbm_last); // time estimation on average so far
+if(lbm==nullptr) return 0.0;
+return steps==max_ulong ? runtime_total : ((double)steps/(double)(lbm->get_t()-steps_last)-1.0)*(runtime_total-runtime_total_last); // time estimation on average so far
//return steps==max_ulong ? runtime_lbm : ((double)steps-(double)(lbm->get_t()-steps_last))*runtime_lbm_timestep_smooth; // instantaneous time estimation
}
void Info::print_logo() const {
@@ -58,11 +42,27 @@ void Info::print_logo() const {
print("| "); print("\\ \\ / /", c); print(" |\n");
print("| "); print("\\ ' /", c); print(" |\n");
print("| "); print("\\ /", c); print(" |\n");
-print("| "); print("\\ /", c); print(" FluidX3D Version 2.18 |\n");
+print("| "); print("\\ /", c); print(" FluidX3D Version 2.19 |\n");
print("| "); print( "'", c); print(" Copyright (c) Dr. Moritz Lehmann |\n");
print("|-----------------------------------------------------------------------------|\n");
}
-void Info::print_initialize() {
+void Info::print_initialize(LBM* lbm) {
+info.allow_printing.lock(); // disable print_update() until print_initialize() has finished
+this->lbm = lbm;
+#if defined(SRT)
+collision = "SRT";
+#elif defined(TRT)
+collision = "TRT";
+#endif // TRT
+#if defined(FP16S)
+collision += " (FP32/FP16S)";
+#elif defined(FP16C)
+collision += " (FP32/FP16C)";
+#else // FP32
+collision += " (FP32/FP32)";
+#endif // FP32
+cpu_mem_required = (uint)(lbm->get_N()*(ulong)bytes_per_cell_host()/1048576ull); // reset to get valid values for consecutive simulations
+gpu_mem_required = lbm->lbm_domain[0]->get_device().info.memory_used;
const float Re = lbm->get_Re_max();
println("|-----------------.-----------------------------------------------------------|");
println("| Grid Resolution | "+alignr(57u, to_string(lbm->get_Nx())+" x "+to_string(lbm->get_Ny())+" x "+to_string(lbm->get_Nz())+" = "+to_string(lbm->get_N()))+" |");
@@ -91,10 +91,12 @@ void Info::print_initialize() {
println("'-----------------'-----------------------------------------------------------'");
#endif // INTERACTIVE_GRAPHICS_ASCII
clock.start();
-allow_rendering = true;
+info.allow_printing.unlock();
}
void Info::print_update() const {
-if(allow_rendering) reprint(
+if(lbm==nullptr) return;
+info.allow_printing.lock();
+reprint(
"|"+alignr(8, to_uint((double)lbm->get_N()*1E-6/runtime_lbm_timestep_smooth))+" |"+ // MLUPs
alignr(7, to_uint((double)lbm->get_N()*(double)bandwidth_bytes_per_cell_device()*1E-9/runtime_lbm_timestep_smooth))+" GB/s |"+ // memory bandwidth
alignr(10, to_uint(1.0/runtime_lbm_timestep_smooth))+" | "+ // steps/s
@@ -103,16 +105,17 @@ void Info::print_update() const {
);
#ifdef GRAPHICS
if(key_G) { // print camera settings
-const string camera_position = "float3("+to_string(camera.pos.x/(float)lbm->get_Nx(), 6u)+"f*(float)Nx, "+to_string(camera.pos.y/(float)lbm->get_Ny(), 6u)+"f*(float)Ny, "+to_string(camera.pos.z/(float)lbm->get_Nz(), 6u)+"f*(float)Nz)";
-const string camera_rx_ry_fov = to_string(degrees(camera.rx)-90.0, 1u)+"f, "+to_string(180.0-degrees(camera.ry), 1u)+"f, "+to_string(camera.fov, 1u)+"f";
-const string camera_zoom = to_string(camera.zoom*(float)fmax(fmax(lbm->get_Nx(), lbm->get_Ny()), lbm->get_Nz())/(float)min(camera.width, camera.height), 6u)+"f";
-if(camera.free) print_info("lbm.graphics.set_camera_free("+camera_position+", "+camera_rx_ry_fov+");");
-else print_info("lbm.graphics.set_camera_centered("+camera_rx_ry_fov+", "+camera_zoom+");");
+const string camera_position = "float3("+alignr(9u, to_string(camera.pos.x/(float)lbm->get_Nx(), 6u))+"f*(float)Nx, "+alignr(9u, to_string(camera.pos.y/(float)lbm->get_Ny(), 6u))+"f*(float)Ny, "+alignr(9u, to_string(camera.pos.z/(float)lbm->get_Nz(), 6u))+"f*(float)Nz)";
+const string camera_rx_ry_fov = alignr(6u, to_string(degrees(camera.rx)-90.0, 1u))+"f, "+alignr(5u, to_string(180.0-degrees(camera.ry), 1u))+"f, "+alignr(5u, to_string(camera.fov, 1u))+"f";
+const string camera_zoom = alignr(8u, to_string(camera.zoom*(float)fmax(fmax(lbm->get_Nx(), lbm->get_Ny()), lbm->get_Nz())/(float)min(camera.width, camera.height), 6u))+"f";
+if(camera.free) println("\rlbm.graphics.set_camera_free("+camera_position+", "+camera_rx_ry_fov+");");
+else println("\rlbm.graphics.set_camera_centered("+camera_rx_ry_fov+", "+camera_zoom+"); ");
key_G = false;
}
#endif // GRAPHICS
+info.allow_printing.unlock();
}
void Info::print_finalize() {
-allow_rendering = false;
+lbm = nullptr;
println("\n|---------'-------------'-----------'-------------------'---------------------|");
}
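
The changed return expression in `Info::time()` extrapolates the remaining time from the average pace so far: the ratio of total steps to completed steps, minus one, multiplied by the elapsed time. Since the diff switches the elapsed time from `runtime_lbm` to `runtime_total`, which `info.hpp` describes as compute plus rendering plus data evaluation, the estimate now also covers rendering time. A minimal standalone sketch of that arithmetic follows. `remaining_runtime` is a hypothetical free function, and its guard for zero completed steps is an addition for this example that the original expression does not have.

```cpp
// Hypothetical free-function version of the averaged estimate used in Info::time().
// steps_total: steps requested for this run
// steps_done: steps completed so far in this run (lbm->get_t()-steps_last in the diff)
// elapsed_seconds: wall-clock time spent on this run so far (runtime_total-runtime_total_last in the diff)
double remaining_runtime(const unsigned long steps_total, const unsigned long steps_done, const double elapsed_seconds) {
	if(steps_done==0ul) return 0.0; // no progress yet, so no estimate (guard added for this sketch)
	return ((double)steps_total/(double)steps_done-1.0)*elapsed_seconds; // remaining = (total/done - 1) * elapsed
}
```

For example, 250 of 1000 steps finished after 10 seconds gives (1000/250 - 1) * 10 = 30 seconds remaining.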

src/info.hpp

Lines changed: 6 additions & 7 deletions
@@ -1,24 +1,23 @@
#pragma once

#include "utilities.hpp"
+#include <mutex>

class LBM;
struct Info { // contains redundant information for console printing
LBM* lbm = nullptr;
-bool allow_rendering = false; // allows interactive redering if true
-bool allow_labeling = true; // allows drawing label if true
-double runtime_lbm=0.0, runtime_total=0.0f; // lbm (compute) and total (compute + rendering + data evaluation) runtime
-double runtime_lbm_timestep_last=1.0, runtime_lbm_timestep_smooth=1.0, runtime_lbm_last=0.0; // for printing simulation info
+double runtime_lbm=0.0, runtime_total=0.0f, runtime_total_last=0.0; // lbm (compute) and total (compute + rendering + data evaluation) runtime
+double runtime_lbm_timestep_last=1.0, runtime_lbm_timestep_smooth=1.0; // for printing simulation info
Clock clock; // for measuring total runtime
-ulong steps=max_ulong, steps_last=0ull; // runtime_lbm_last and steps_last are there if multiple run() commands are executed consecutively
+ulong steps=max_ulong, steps_last=0ull; // runtime_total_last and steps_last are there if multiple run() commands are executed consecutively
uint cpu_mem_required=0u, gpu_mem_required=0u; // all in MB
string collision = "";
-void initialize(LBM* lbm);
+std::mutex allow_printing; // to prevent threading conflicts when continuously printing updates to console
void append(const ulong steps, const ulong total_steps, const ulong t);
void update(const double dt);
double time() const; // returns either elapsed time or remaining time
void print_logo() const;
-void print_initialize(); // enables interactive rendering
+void print_initialize(LBM* lbm); // enables interactive rendering
void print_update() const;
void print_finalize(); // disables interactive rendering
};
