Fluent HPC MPI issue, can only use a certain number of cores

  • Last Post 30 October 2019
zbharris posted this 22 October 2019

Hello,

I am a university student and have been using Fluent on a desktop for a while, running up to 12 threads (processes) without issue. I recently started using an older server for HPC and am running into problems. The server runs Windows Server 2012 R2 and has four AMD Opteron 6380 processors, for a total of 32 hyperthreaded cores, and 256 GB of RAM. Ansys 18.2 is installed, and I am using Workbench for a fuel injection case. The system works fine and seems to scale properly with up to 9 processes, and Fluent appears to see the system as having a single 64-thread CPU. The problem appears with 10 or more processes.

With the default settings, which appear to use the ibmmpi MPI type, Fluent crashes after reaching the point where it says "...spawning node 0 on machine...", with a Windows error saying mpirun.exe has stopped working. Once Fluent is closed, Workbench gives the error "The FLUENT application failed to validate the connection." For the sake of trial and error, I changed the MPI type to msmpi, which gives a slightly different error in Fluent, saying that it is unable to find the file for the msmpi type; Workbench gives the same error as before. With the intel MPI type, on the other hand, Fluent loads with 32 processes without apparent issue and even appears to run correctly, crunching through iterations much faster than with 9 processes. Invariably, though, the residuals jump after a number of iterations with this setup, and Fluent gives the "Divergence detected in AMG for pressure coupled, protective actions enabled!" and "...temporarily solve with BCGSTAB!" warnings.
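
The trial-and-error over MPI types and process counts described above could also be run outside Workbench. The following is a minimal sketch that only builds the launch command lines; it does not start Fluent. It assumes the standalone launcher is on the PATH as `fluent`, that the axisymmetric case runs in the 2D double-precision solver (`2ddp`), and that the `-t<N>` and `-mpi=<type>` launcher options are available in the installed release. The function names and the journal file name are hypothetical.

```python
from itertools import product


def fluent_command(n_procs, mpi_type, solver="2ddp", journal=None, exe="fluent"):
    """Build a Fluent launch command as a list of arguments.

    Assumptions (not from the original post): the launcher is called
    `fluent`, the case uses the 2D double-precision solver, and the
    -t<N> / -mpi=<type> options are supported by the installed release.
    """
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    cmd = [exe, solver, f"-t{n_procs}", f"-mpi={mpi_type}"]
    if journal is not None:
        # -g runs without the GUI, -i reads commands from a journal file.
        cmd += ["-g", "-i", journal]
    return cmd


def sweep(proc_counts, mpi_types, **kwargs):
    """Return one command per (MPI type, process count) combination."""
    return [
        fluent_command(n, mpi, **kwargs)
        for mpi, n in product(mpi_types, proc_counts)
    ]


if __name__ == "__main__":
    # Process counts around the failure threshold reported in the post.
    for cmd in sweep([9, 10, 16, 32], ["ibmmpi", "intel", "msmpi"]):
        print(" ".join(cmd))
```

Running each printed command by hand (or with a hypothetical journal such as `run.jou`) would show whether the crash at 10 or more processes depends on the MPI type or on Workbench itself.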

If the error occurred with more than 8 or more than 16 processes, I would suspect that it is related to the server having 4 CPUs, but the fact that it occurs with more than 9 is confusing. Please advise, and thank you for your help.

abenhadj posted this 23 October 2019

Switch off hyperthreading and test with the latest release.

What kind of case is it?

Does the issue occur on other machines?

Best regards, Amine

zbharris posted this 29 October 2019

The case is a steady-state, supercritical (so single-phase), axisymmetric fuel injection. It is a very simple case, with no issues running as many as 12 processes on other machines (12 is the maximum number of threads on those machines).

We have tried to switch off hyperthreading, but there is no option to do so in the BIOS. In the BIOS, the CPUs are displayed as each having 16 cores, but Windows sees 32 hyperthreaded cores. Setting each CPU to 8 cores in the BIOS causes Windows to see 4 cores and 8 threads per CPU, and Fluent still crashes with 10 or more processes but is fine with 9. The CPUs use the AMD Bulldozer architecture, where the difference between threads and cores is muddier than it is with Intel, because with Bulldozer each hyperthreaded thread gets its own L1 cache. We suspect this is why hyperthreading cannot be disabled.
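
The core and thread counts described above can be reproduced with a little arithmetic. This is a minimal sketch, assuming Windows reports each Bulldozer module (two integer cores) as one core with two threads, which is consistent with the numbers in this post. The function name and dictionary keys are hypothetical.

```python
def windows_view(sockets, bios_cores_per_socket, int_cores_per_module=2):
    """Estimate how Windows reports a Bulldozer-based system.

    Assumption: each module, which holds `int_cores_per_module` integer
    cores, is shown by Windows as one core with that many threads.
    """
    if bios_cores_per_socket % int_cores_per_module != 0:
        raise ValueError("BIOS core count must be a whole number of modules")
    modules = bios_cores_per_socket // int_cores_per_module
    return {
        "cores_per_cpu": modules,
        "threads_per_cpu": bios_cores_per_socket,
        "total_cores": sockets * modules,
        "total_threads": sockets * bios_cores_per_socket,
    }


if __name__ == "__main__":
    # Default BIOS setting: 16 cores per CPU on four sockets.
    print(windows_view(4, 16))
    # Reduced BIOS setting: 8 cores per CPU on four sockets.
    print(windows_view(4, 8))
```

With 16 BIOS cores per CPU this gives 32 cores and 64 threads in total, and with 8 BIOS cores per CPU it gives 4 cores and 8 threads per CPU, matching what Windows shows in both configurations.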

abenhadj posted this 30 October 2019

Check with IT whether there is a way to switch it off. Meanwhile, try updating Fluent and using Intel MPI.

Best regards, Amine
