Explicit dynamics on HPC

alitabei posted this 27 November 2019

Hi,

I run Mechanical models on the HPC cluster with no problem. In the client RSM configuration, the shared-memory-parallel parameter is "thread" and the distributed-parallel parameter is "openmpi".

However, I can't run Explicit Dynamics simulations on more than one core on the cluster. If I request more than one core, the simulation does not start: the attached solution output shows the pre-solver steps all completing, but the solve never starts. With one core on the cluster it runs fine, so something is not right with the distributed-parallel parameter.

Do I need to change the openmpi parameter, or is something else not correct? Thanks

tsiriaks posted this 02 December 2019

Ali,

Can you post the full RSM Job Report inline as text (or as screenshots if the forum doesn't allow that)?

Thanks,

Win

alitabei posted this 02 December 2019

Hello Win, 

Please find attached a screenshot of the part of the RSM report with the errors/warnings.

Thanks

tsiriaks posted this 02 December 2019

Ali,

Please post the entire RSM Job Report, including line numbers.

Thanks,

Win

alitabei posted this 02 December 2019

Win, 

Since it won't fit here, I am putting a public link to the full RSM report below. Can you open it?

Please let me know if I should provide it to you via another method.

Thanks

tsiriaks posted this 02 December 2019

Ali,

Sorry, we are not allowed to open or download it.

How many lines are in the report? If it's a few hundred, can you take screenshots and post them here? Otherwise, please post a few screenshots of the very top part of the report.

Thanks,

Win

alitabei posted this 02 December 2019

Win, 

Please see them below:

tsiriaks posted this 03 December 2019

Thank you for the info.

Does this compute-j-1 node have 32 cores?

Can you try with 4 cores first ?

Also, for your static structural analysis, please check whether it also solves without issue when submitted to the same compute node (compute-j-1).

Thanks,

Win

alitabei posted this 03 December 2019

Hello Win, 

Yes, compute-j-1 has 32 cores. My queue has two other compute nodes, each with 32 cores.  I successfully ran a structural analysis on either of the nodes (the RSM report screenshot is below). 

The Explicit Dynamics run on 4 cores also fails (the screenshots are again below).

I noticed one thing: my desktop has 8 cores, and when an explicit job finishes on my desktop, it reports that it used Intel MPI (screenshot below), while in RSM, for the structural analysis, I am asking for OpenMPI (screenshot below). Does the Autodyn solver distribute the solution with Intel MPI only? Maybe I need to change this in my RSM configuration?

Thanks for your help.

The structural job on 33 cores on the cluster:

And the explicit job on 4 cores, which again failed:

tsiriaks posted this 03 December 2019

Thanks for the info Ali.
It seems that Explicit Dynamics + RSM adds the switch -mpi -ibmmpi to the submission command, while your static structural job does not have this switch (you can see this on the 'Running Solver' line in each RSM Job Report). So your static structural analyses are using OpenMPI correctly, but your explicit dynamics analyses are using IBM MPI, which runs into the issue. I will have to do a bit of research on how to control the MPI for explicit dynamics analyses; I will get back to you.
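
For illustration, here is roughly what to compare in the two reports. These lines are placeholders, not text taken from the actual job reports; the install paths, core counts, and other arguments will differ on your cluster:

    Static structural report  -> Running Solver : <install_dir>/.../solver -dis -np 4 ...
    Explicit dynamics report  -> Running Solver : <install_dir>/.../solver -np 4 -mpi -ibmmpi ...

The point is that the explicit dynamics line carries the extra MPI switch while the structural one does not.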

Thanks,

Win

alitabei posted this 03 December 2019

Thanks very much, Win. I will ask our admins whether we have IBM MPI on our cluster and whether it can be added to my compute nodes.

Please let me know what you find about this, too. Looking forward to having this issue fixed.

Best

tsiriaks posted this 03 December 2019

Hi Ali,

Can you also post a screenshot of your WB project schematic, and a screenshot of the first tab of your RSM Configuration GUI?

Another thing: can you try taking the entire solver command from the 'Running Solver' line in an explicit dynamics RSM Job Report (4 or 32 cores is fine), and then run that command manually on the submit node (submit-3 something)? Make sure you execute the command from the staging directory and that the admodel_0.ad input file exists there. If you don't have these files or that folder, you can check the option in the RSM Configuration GUI to keep the files in the staging directory after the job is done, then submit a job, let it fail, and use the solver command and files from there. If you have questions about this, let us know.
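
For example, the manual test could look like the sketch below. The staging path and job folder here are placeholders, and the solver command itself must be copied verbatim from the 'Running Solver' line of your report:

    # Log in to the submit node, then change to the job's staging
    # directory (placeholder path -- use your site's real one):
    cd /path/to/staging/<job_folder>

    # Confirm the explicit dynamics input file is present:
    ls -l admodel_0.ad

    # Paste and run the exact 'Running Solver' command from the report,
    # capturing the output so any MPI error is visible:
    <full solver command from the report> 2>&1 | tee manual_solve.log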

Also, it turns out that only IBM MPI is supported for explicit dynamics analyses on a Linux cluster; ref:

https://ansyshelp.ansys.com/account/secured?returnurl=/Views/Secured/corp/v195/adyn_para/adyn_para_config.html

So your IT staff will have to make sure that the cluster can run jobs with IBM MPI.

Thanks,

Win

alitabei posted this 03 December 2019

Hi Win, 

Please find both screenshots attached. Our admin says of IBM MPI: "It's in /opt/intel on all of the cluster machines." But when I tried ibmmpi or impi in the first tab of the RSM GUI, I got the error shown in the screenshot below.
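
For reference, a quick way to check which vendor's MPI actually lives under that path (a sketch, assuming standard install locations; /opt/intel normally holds Intel tools such as Intel MPI, not IBM MPI):

    # List what is actually installed under /opt/intel
    # (Intel MPI typically sits in /opt/intel/impi/<version>):
    ls /opt/intel

    # Ask whichever mpirun is on the PATH to identify itself:
    mpirun --version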

I will try the manual command (if I understood the instructions correctly) and will let you know what happens. 

Thanks

tsiriaks posted this 03 December 2019

Ali, 

The 'PE' field in the first tab of the RSM Configuration GUI refers to a parallel environment that your cluster admin manually creates for the UGE/SGE scheduler; it is not where you specify the MPI implementation for the ANSYS solver. So you can't just change openmpi to ibmmpi there. Ask your cluster admin for the correct PE to use with IBM MPI (if it's not the same 'openmpi'). Those two boxes tell the UGE/SGE scheduler what environment to set up for the parallel part of the job. The ANSYS explicit solver always defaults to IBM MPI, so you don't have to worry about telling the solver to use it.
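
For reference, the PE is defined on the scheduler side, and your admin can inspect the existing one with something like the sketch below. The PE name 'openmpi' comes from your RSM setting; the field values shown are examples only, not your site's actual configuration:

    # Run by the cluster admin on a UGE/SGE host:
    qconf -sp openmpi

    # Typical fields in the output (example values):
    #   pe_name            openmpi
    #   slots              999
    #   allocation_rule    $fill_up
    #   control_slaves     TRUE
    #   job_is_first_task  FALSE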

Yes, please try the manual command submission and let us know what error you get.

Thanks,

Win

alitabei posted this 05 December 2019

Hi Win,

I got a correction from our HPC admin: we do not have IBM MPI! We have Intel MPI.

They are short-staffed right now, so installing IBM MPI won't happen until the end of January.

I guess there is nothing else I can do until then? If that's the case, should I close this thread now and open another one in Jan/Feb, or leave it open?

thanks

tsiriaks posted this 05 December 2019

Hi Ali,

Unfortunately, that's it, since no other MPI implementation is supported for this.

You can either respond here again when IBM MPI is installed, or create a new thread and reference the URL of this one. That is entirely up to you.

Thanks,

Win

alitabei posted this 05 December 2019

Thanks, Win. I will post questions here after we get IBM MPI.

tsiriaks posted this 06 December 2019

Sounds good, Ali.