HFSS HPC remote solve issue

  • Last Post 30 April 2019
  • Topic Is Solved
tphelps posted this 16 April 2019

We are unable to run simulations remotely when we use the automatic settings. I am using Ansys Electromagnetics Suite 19.2.0:

image.png

I specified 24 cores in my analysis configuration in HPC and Analysis Options:

image.png

The adaptive solution process works fine, but frequency sweeps throw an error saying that the HFSS solver failed to start:

image.png

If, however, I uncheck the 'Use Automatic Settings' box in the analysis configuration, frequency sweeps run with no errors. For example, the following configuration works fine:

image.png  

Can you help us understand this error? Are we losing performance by not using automatic settings in the analysis configuration, and does it really make a difference if we specify Total Enabled Tasks and Total Enabled Cores?

tsiriaks posted this 17 April 2019

Could you re-post the images?

When the job fails, open the 'profile', scroll to the bottom of it, and post a screenshot here.

How many tasks are you specifying when you uncheck Use Automatic Settings?

The performance question is tough to answer because it varies case by case, but in general we recommend using Use Automatic Settings (if there is no issue with it).

Thanks,

Win

tphelps posted this 17 April 2019

Hello Win,

Thank you for the response. If we use automatic settings, we have found that we can only use one task on a single core. With previous versions of HFSS (v17) we were able to run multiple tasks at once (i.e. solve multiple frequencies in a sweep at the same time). We want to try to get the most out of our server again.

We will do the test you asked for and post it asap. Could you please clarify what you mean by "open the profile"? 

Here are the reposted images.

Image 1:

image.png

Image 2:

image.png

Image 3:

image.png

Image 4:

image.png

tphelps posted this 17 April 2019

Here is the profile:

tphelps posted this 17 April 2019

Here is the error that came with the profile:

tsiriaks posted this 17 April 2019

Somehow your first 4 images are still not posted correctly. Could you capture them again and re-post?

If it works only with 1 task, try changing the MPI setting in HPC and Analysis Options to Intel. Does this help?

Also, I would recommend trying 2019 R1.

Thanks,

Win

tphelps posted this 17 April 2019

Here are the photos again. I can see them on this post from multiple computers, so I am not sure what is happening:

 

Changing to Intel MPI does not help; we see the same error. What reasons are there for the HFSS solver to fail to start?

We will try with the updated HFSS, but what are some other solutions we could try?

Thanks.

tphelps posted this 18 April 2019

Hello,

We are having the same issues with 2019 R1. Here is the profile:

 

This profile was done with the automatic settings enabled.

tsiriaks posted this 18 April 2019

Thanks, I can see the images now.

It seems like the auto setting is trying to solve on 9 tasks.

Also, when you uncheck the auto setting, are you actually using 12 tasks? I might have misunderstood your statement here:

"we can only use one task on a single core"

So does this 12 tasks + 24 cores setting work fine?

I'm not sure why auto setting is trying to solve on 9 tasks while you are specifying 24 cores. It's not balanced.
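To see why 9 tasks on 24 cores looks unbalanced, here is a rough arithmetic sketch. It only divides cores among tasks as evenly as integer division allows; it is not HFSS's actual scheduling logic, and the function name `split_cores` is made up for illustration.

```python
from typing import List


def split_cores(total_cores: int, tasks: int) -> List[int]:
    """Spread total_cores over tasks as evenly as whole cores allow.

    Illustration only: this is plain integer arithmetic, not the
    allocation rule used by the HFSS automatic settings.
    """
    if tasks <= 0 or total_cores < tasks:
        raise ValueError("need at least one core per task")
    base, extra = divmod(total_cores, tasks)
    # The first `extra` tasks get one more core than the rest.
    return [base + 1] * extra + [base] * (tasks - extra)


print(split_cores(24, 9))   # uneven: some tasks get 3 cores, some get 2
print(split_cores(24, 12))  # even: every task gets 2 cores
```

With 24 cores, 12 tasks divide evenly while 9 tasks do not, which is the imbalance referred to above.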

On the remote (solve) machine, open a Command Prompt and run the following command:

WMIC CPU Get DeviceID,NumberOfCores,NumberOfLogicalProcessors

What do you get?
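If it helps to total the numbers across sockets, the command's table output can be summarized with a small script. This is a minimal sketch: the sample text below is hypothetical (a made-up two-socket machine, not output from this thread), and `summarize_wmic_cpu` is an illustrative helper name.

```python
# Hypothetical sample of the WMIC table output (values are invented).
SAMPLE = """\
DeviceID  NumberOfCores  NumberOfLogicalProcessors
CPU0      12             24
CPU1      12             24
"""


def summarize_wmic_cpu(text: str) -> dict:
    """Total the physical cores and logical processors over all CPU rows."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    cores_idx = header.index("NumberOfCores")
    logical_idx = header.index("NumberOfLogicalProcessors")
    return {
        "sockets": len(data),
        "physical_cores": sum(int(r[cores_idx]) for r in data),
        "logical_processors": sum(int(r[logical_idx]) for r in data),
    }


print(summarize_wmic_cpu(SAMPLE))
```

Comparing the physical core total with the number of cores entered in HPC and Analysis Options shows whether the configuration asks for more cores than the machine has.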

Thanks,

Win

 

tphelps posted this 18 April 2019

Hello Win,

 

So the test in the last post actually did not run. It failed with the error:

"The HFSS solver failed to start, please contact support"

It was a 9 point frequency sweep and that is why you see only 9 tasks. 

We have found that unless we do one task at a time (in automatic settings this means only allowing one core, in manual settings this means not selecting any options in the job distribution tab) we will see this error. We are trying to figure out why this is.

We will run your test now.

tphelps posted this 18 April 2019

tsiriaks posted this 18 April 2019

Ah I see.

What if you use the 'Auto Settings' with 9 cores or 18 cores?

tphelps posted this 18 April 2019

We see the same result with both of these core settings. 

tsiriaks posted this 18 April 2019

Have you tried this with Intel MPI?

Let's uncheck the automatic setting for now and use 2 tasks and 10 cores with Intel MPI.

tphelps posted this 19 April 2019

With nothing checked in job distribution? Does this still try and use distributed computing at all?

We will run this test, but I got the impression that unless we checked some options here it would just do a single task at a time.

tphelps posted this 19 April 2019

We found the following:

-With nothing checked it seems to run as if we had the automatic settings on and one task available.

-With only domain solver checked we see no error, and it seems a bit faster.

-With frequencies checked we see the following error

tsiriaks posted this 19 April 2019

I will have to ask others to chime in. I'm not very familiar with solving with the auto setting unchecked.

In the meantime, what if you specify a username/password (the login for the remote machine) in Tools > options > general options > remote analysis, in the ANSYS EM GUI on your local machine?

tphelps posted this 19 April 2019

Yes please, another voice would be good.

 

We will try this but I don't think we are having any problems authenticating. Maybe we could approach the issue from the server side? What are the things we should do on the server side to make sure it can run correctly and fully utilize its cores?

 

tphelps posted this 19 April 2019

New info:

 

We tried running a simulation directly on the server instead of remotely from a workstation and found that frequencies solve in parallel. However, the same settings did not work when solving remotely from the workstation. What are some issues that could cause that?

 

Thanks

tsiriaks posted this 22 April 2019

Did you try specifying the username/password as I described above, so that the processes run as a regular user instead of the service user?

Also, try turning off the firewall and antivirus software on the remote solve machine.

tphelps posted this 22 April 2019

Hello! 

 

Just following up. Can you think of anything we could try?

 

Thanks

jcallery posted this 22 April 2019

Hi tphelps,

Were you able to try the suggestions Win has already mentioned regarding the firewall, antivirus, and which user to run the service as?

If so what were the results?

Thank you,

Jake

tphelps posted this 23 April 2019

Hello All,

 

Apologies, for some reason your forum posts did not show up for me until now. We have tried the following:

-VPN to the same network (campus network). This did not work at all; we could not even locate the server.

-We tried simulating with all firewalls off. No change.

-We tried the suggested settings to run as a different user. This also resulted in the same error as before.

Thanks

tsiriaks posted this 24 April 2019

Let us contact you directly about this issue.

 

tphelps posted this 24 April 2019

That would be great! How would you like to go about it?

tsiriaks posted this 30 April 2019

Note: This issue was resolved by using the hostname instead of the IP address in HPC and Analysis Options when specifying the remote solve machine.
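For anyone checking their own setup, a quick way to tell whether a machine entry is an IP literal, and whether a hostname resolves from the local workstation, is sketched below. It uses only the Python standard library and is independent of Ansys; the helper names and the example entries are assumptions for illustration.

```python
import ipaddress
import socket


def is_ip_literal(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def check_machine_name(value: str) -> str:
    """Describe how a machine entry resolves from this workstation."""
    if is_ip_literal(value):
        try:
            # Reverse lookup: find a hostname that could be used instead.
            name = socket.gethostbyaddr(value)[0]
        except OSError:
            return f"{value} is an IP address; no reverse DNS name found"
        return f"{value} is an IP address; hostname candidate: {name}"
    try:
        addr = socket.gethostbyname(value)
    except OSError:
        return f"{value} does not resolve from this machine"
    return f"{value} resolves to {addr}"


if __name__ == "__main__":
    # "localhost" is only a stand-in; substitute the solve machine's entry.
    print(check_machine_name("localhost"))
```

If the entry is an IP literal, the reverse lookup suggests a hostname to try in the machine list; if a hostname does not resolve, that points to a DNS or hosts-file problem rather than an HFSS one.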
