On 2024-02-01 21:12, J. Lewis Muir wrote:
On 02/01, Zimoch Dirk wrote:
Normally I am running it as a service. I gave that simpler scenario because it
shows the critical points and is easier to reproduce.
Our actual problem was that [snip].
Ah, OK, thanks for that explanation; makes sense.
I tested with casw:
0. caRepeater.service is running
1. start casw
2. start an ioc. casw shows the beacon anomaly
3. sudo systemctl restart caRepeater.service
4. start an ioc. casw does not show any beacon anomalies any more
5. restart casw. It works again.
Unfortunately, casw (or any CA client) cannot find out that the caRepeater it
had registered with has died. Thus it never tries to reconnect.
Ouch. That seems like a major problem to me. It seems like that means
that to upgrade caRepeater, you have to restart all CA clients as well,
which would include IOCs that are CA clients. If you don't do that, the
CA clients (including IOCs that are CA clients, for example, via a CA
link) will stop working correctly. Is that right? If so, that's rough.
Well, I wonder why people update the repeater at all.
The code used inside the repeater has been stable for years;
at least, there are no day-to-day improvements.
We could simply say: do not update the repeater every night, only
when something has really changed, that is, when a real problem
has been fixed.
What is the repeater good for?
It forwards the beacons from the different IOCs (on different hosts) to
the different clients (all on the same host as the repeater).
Right now it seems that a missing beacon (via UDP) from one IOC
makes the client (camonitor in my case) send a CA message (?)
via TCP instead, so things continue to work.
Depending on the number of clients, any gateways, the network load,
and CPU capacity, this may be a workable solution.
But that is my limited understanding.
I know hardly anything about the CA protocol, so what I'm about
to say may not be possible or may not even make sense, but I wonder
if caRepeater could be changed to send some kind of CA message to all
registered clients when it's about to exit? That wouldn't work for
the case of caRepeater being sent a SIGKILL or SIGSTOP signal (or the
equivalent on Windows), nor the case of caRepeater crashing, but it
would work for the case of signals that can be caught. Still, such a
solution doesn't seem particularly robust since it wouldn't work if the
CA message didn't get delivered to all clients for whatever reason.
Does the repeater ever exit? Normally not, unless it is terminated
by a signal.
I wonder if the CA protocol could be extended to support some kind of
mechanism to allow clients to detect when the caRepeater has died,
stopped working, or restarted? For example, maybe CA clients could
periodically poll for a unique caRepeater ID that would change when a
new caRepeater process is started?
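To make the polling idea above concrete, here is a minimal in-memory sketch in Python. Nothing in it is part of the CA protocol or of EPICS Base: `FakeRepeater`, `Client`, `instance_id`, and `poll` are hypothetical names, and the "repeater" is just an object, where creating a new one stands in for a restarted process.

```python
import uuid


class FakeRepeater:
    """Stand-in for a caRepeater process; a new object models a restart."""

    def __init__(self):
        # Hypothetical per-process ID; the real caRepeater has no such field.
        self.instance_id = uuid.uuid4().hex
        self.registered = set()

    def register(self, client_name):
        self.registered.add(client_name)
        return self.instance_id


class Client:
    def __init__(self, name):
        self.name = name
        self.known_id = None

    def poll(self, repeater):
        """Register again if the repeater's ID differs from the last one seen."""
        if repeater.instance_id != self.known_id:
            self.known_id = repeater.register(self.name)
            return True  # (re)registered
        return False  # same repeater instance, nothing to do


client = Client("casw")
old = FakeRepeater()
first = client.poll(old)      # initial registration
unchanged = client.poll(old)  # same instance, no action
new = FakeRepeater()          # models "systemctl restart caRepeater.service"
after_restart = client.poll(new)
```

In this sketch a restart is detected only at the next poll, so the polling interval would bound how long a client stays unregistered.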
There may be no need to extend the protocol.
The client(s) will notice that the beacons are missing.
And that can mean several things:
IOC down.
Network down.
Repeater down.
It might be possible to fit a "repeater, are you alive?" check
into the code. How much sense that makes, I don't know.
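A client-side version of this could be as simple as a silence timer. The following Python sketch is only an illustration: `RepeaterWatchdog`, the 30-second timeout, and the idea of re-registering after silence are assumptions, not how the CA client library behaves. Because silence is ambiguous (IOC, network, or repeater down), the sketch assumes that registering again with a repeater that is still alive does no harm.

```python
import time


class RepeaterWatchdog:
    """Tracks how long it has been since the last forwarded beacon."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = clock()

    def beacon_received(self):
        # Call whenever any beacon arrives via the repeater.
        self.last_seen = self.clock()

    def should_reregister(self):
        # True once the silence has lasted longer than the timeout.
        return self.clock() - self.last_seen > self.timeout_s


# Drive the watchdog with a fake clock so the example is deterministic.
now = [0.0]
dog = RepeaterWatchdog(timeout_s=30.0, clock=lambda: now[0])

now[0] = 10.0
dog.beacon_received()                # last beacon at t = 10 s
now[0] = 35.0
quiet_25s = dog.should_reregister()  # 25 s of silence
now[0] = 45.0
quiet_35s = dog.should_reregister()  # 35 s of silence
```

The timeout would have to be longer than the slowest expected beacon period, otherwise a healthy but quiet network would trigger needless re-registration.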
uff.
A TCP connection to the caRepeater, instead of UDP, would not have this
problem, I think.
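The reason a stream connection helps is that the surviving end sees end-of-file when the peer process goes away. A small Python sketch of that effect follows; it uses `socket.socketpair()` as a stand-in for a client-to-repeater TCP connection, and `peer_has_closed` is a hypothetical helper, not anything from EPICS.

```python
import socket


def peer_has_closed(sock):
    """Return True if the stream peer closed the connection (recv sees EOF)."""
    sock.setblocking(False)
    try:
        # MSG_PEEK leaves any pending data in the buffer; b"" means EOF.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except BlockingIOError:
        return False  # no data and no EOF: the peer is still there


# A connected stream pair stands in for client <-> repeater over TCP.
client_end, repeater_end = socket.socketpair()

alive_before = not peer_has_closed(client_end)
repeater_end.close()  # models the repeater process exiting
alive_after = not peer_has_closed(client_end)

client_end.close()
```

With UDP there is no connection to tear down, so the registered client gets no comparable signal.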
Patches welcome. Or is that too harsh? Just trying to be helpful.
Interesting.
Lewis
- References:
- caRepeater question Zimoch Dirk via Core-talk
- Re: caRepeater question Torsten Bögershausen via Core-talk
- Re: caRepeater question Zimoch Dirk via Core-talk
- Re: Re: caRepeater question J. Lewis Muir via Core-talk
- Re: Re: caRepeater question Zimoch Dirk via Core-talk
- Re: caRepeater question J. Lewis Muir via Core-talk