Forum: PogamutUT2004

Bot state change to dead freezes code execution.

Hi all,

I have the following problem: my bot dies, on average, in every run of the experimental setup, but on a few rare occasions I have noticed that after it switches to the dead state the whole of Pogamut freezes. I get the following output and the code does not budge any further (the "HEY BOT WAS KILLED" line is just a simple debug message saying that I am in the botKilled method):

Bot switched to DEAD STATE.
HEY BOT WAS KILLED ....
(ThiefBot) WARNING 21:16:29.780 FSMBotDeadState: unprocessed message InfoMessageZoneChangedBot | Id = CTF-1on1-Joust.ZoneInfo0 |

Please could you advise me on this one? Has anyone come across this in their experiments? Surely I am not the first one :-) Can I write some code to prevent Pogamut from freezing, as this simply stops my app/experiments and I have to begin from the very start? Maybe I can wrap it in some exception handler and simply ignore it, etc...

Thanks,

P.
Hi!

This is truly a mystery - I've checked the code and it should not happen.

Do you spawn your bots automatically? I mean, did you change the bot spawning mode to manual by any chance?
Whenever the bot dies it respawns itself automatically (unless you send a Configuration message with ManualSpawn set to true). We've never experienced such a problem before.

I've also checked that the WARNING message does not pose any problem - it simply reminds you, that some message from GB2004 has been discarded as the bot is dead (it does not throw any exception that would kill the Pogamut).

Are you sure that there is no "busy waiting" in your code that would pause the whole execution (processing new messages)?

Also, I would suggest a simple experiment: create a new project from the samples (EmptyBot), run it and kill the bot (in the game) ... see what happens. Does the bot respawn OK?

Cheers!
Jimmy
Another idea - could it be caused by some unhandled exception? Do you call any methods on your bots from a thread other than the bot's own thread? If yes, wrap each such call in a try/catch block. (I recently had a problem where I was calling methods on the bots from different threads; this caused a NullPointerException, which was NOT propagated to standard output, and everything then entered a pretty weird state - similar to what you are describing.) A try/catch block might help. Also add some print logs, so you know the state in your code that actually precedes the stopping. If everything fails, try to debug the code with breakpoints and step through it.
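A minimal sketch of the "try/catch plus logging" advice, in plain Java with no Pogamut classes. GuardedCalls, runGuarded and installDefaultHandler are hypothetical names invented for this illustration; the idea is only that every call made from a foreign thread reports its own failure instead of dying silently.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper (not part of Pogamut): makes exceptions on other threads visible. */
final class GuardedCalls {
    private static final Logger LOG = Logger.getLogger("GuardedCalls");

    private GuardedCalls() {}

    /**
     * Runs the action and logs anything it throws.
     * Returns true if the action completed, false if it threw a runtime exception.
     */
    static boolean runGuarded(String label, Runnable action) {
        try {
            action.run();
            return true;
        } catch (RuntimeException e) {
            // Log with the full stack trace so the FIRST failure is easy to find later.
            LOG.log(Level.SEVERE, "Call '" + label + "' failed", e);
            return false;
        }
    }

    /** Last line of defence: report any exception that escapes a thread entirely. */
    static void installDefaultHandler() {
        Thread.setDefaultUncaughtExceptionHandler((thread, e) ->
                LOG.log(Level.SEVERE, "Uncaught exception in thread " + thread.getName(), e));
    }
}
```

A call from another thread would then look like GuardedCalls.runGuarded("stop bot", () -> bot.stop()), where bot stands for whatever object is being touched from outside its own thread.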

Best,
Michal
In response to Jakub:

Yeah, I know that it is rather strange. Previously, when I was only running the bot for one scenario, it was always killed and switched to the dead state but did not hang things up. Now the scenario is repeated numerous times in the experiment and at some k-th repetition it has a tendency to just hang. I must say it has occurred only twice so far, though I have run far more experiments than that, which is why it is so weird.
Yes, I usually test my code with the Hunter bot just to be on the safe side, which was also the case in
the "experimental setup" topic, thanks to which I managed to find the bug.

Yes, I am using manual spawn, but only during initialisation to place the bot at a location of my choice. Once it dies it is automatically respawned, and it is stopped by me issuing this.bot.stop. In my experiments I always execute the same sequence of primitive actions, which sooner or later ends in the bot dying, as it does not perform any combat actions etc.

Finally, after the freeze happened I left the app running for about 1 hour in the hope that it was just some thread waiting. However, it just waited with the two bots frozen in the middle of the map, quite an amusing scene.

I shall try a few more times with the Hunter or EmptyBot to see if this happens. Please could you explain what you mean by "busy waiting" so I can target my code better?

In response to Michal:

I have plenty of objects that are initialised before the bot starts and executed in the logic method, but none of them runs in a separate thread. Yes, I agree that something may be erroring, with the message being swept under the carpet and later freezing the code. I will definitely add some more debugging statements today and see if this happens again.

Thanks for the quick response. I will keep fighting, but in the meantime if you have any more ideas, guys, please send them my way.

P.
Hmm, this sounds really strange.

Another question - are you running more than one bot in the JVM? If so, whenever the bot dies, does it freeze only this one bot or all of the bots?

There might be some race conditions and a deadlock happening in Pogamut ... do you have any other code that reacts to the bot's death?

Geeze, this is giving me a headache...

Jimmy
I am sorry for the late reply, but I have been busy testing code ...

In answer to your question, Jakub: I am running my custom-built bot and the native bot, connected
by issuing the addbot command. The last time this happened both bots had frozen in the middle of the
1on1-Joust map.

Since then I have run quite a few batches of experiments and did not get this error again. However,
you may be right in saying that there is probably some race condition, in which case replicating it
is very difficult and it can take some time to occur again. I only call the code to stop the bot once
per execution of the bot. To be precise, it is called after a fixed number of actions has been executed. There is
no other place that issues stopping or killing commands.

A single experimental run consists of starting the ucc server, starting the bot controller and connecting the two
bots, and then shutting down the whole thing in exactly the way presented by Jacob Schrum. The only place my code
differs is in executing my own bot, as I am using the UT2004BotRunner to achieve this, plus the launching
of the native bot as mentioned earlier. During each run I execute a fixed sequence of actions and, regardless of
whether my bot dies or not, I stop it and restart the server. This needs to be repeated 2000 times.
During the last trials I keep getting one and the same error, but a different one from the dead bot state. It happens
at different times: once I got it after 12 runs, then after 47. What is common is that it always happens after the
server has been initialised and during the launch of my bot. Here is the error I am getting:

Exception in thread "main" ComponentCantStartException[UT2004BotThiefBot: Can't start: BusAwareCountDownLatch: Interrupted because bus was stopped (fatal error, or watched component stopped) while waiting on the latch. (caused by: BusAwareCountDownLatch: Interrupted because bus was stopped (fatal error, or watched component stopped) while waiting on the latch.)]
at cz.cuni.amis.pogamut.base.agent.impl.AbstractAgent.start(AbstractAgent.java:408)
at cz.cuni.amis.pogamut.base.agent.utils.runner.impl.RemoteAgentRunner.startAgent(RemoteAgentRunner.java:70)
at servers.tbcs.ThiefBotControlServer.runServerOnceForParameters(ThiefBotControlServer.java:213)
at servers.tbcs.ThiefBotServerTask.main(ThiefBotServerTask.java:61)
Caused by: BusStoppedInterruptedExceptionBusAwareCountDownLatch: Interrupted because bus was stopped (fatal error, or watched component stopped) while waiting on the latch.
at cz.cuni.amis.pogamut.base.component.bus.event.BusAwareCountDownLatch.checkBusStop(BusAwareCountDownLatch.java:173)
at cz.cuni.amis.pogamut.base.component.bus.event.BusAwareCountDownLatch.await(BusAwareCountDownLatch.java:167)
at cz.cuni.amis.pogamut.ut2004.bot.impl.UT2004Bot.startAgent(UT2004Bot.java:174)
at cz.cuni.amis.pogamut.base.agent.impl.AbstractAgent.start(AbstractAgent.java:394)



And the exact place where this happens in my code is:

UT2004BotFactory factory =
        new UT2004BotFactory(new UT2004BotModule(ThiefBot.class));

UT2004BotRunner botRunner =
        new UT2004BotRunner(factory, Constants.BOT_NAME) {

            @Override
            protected void preStartHook(UT2004Bot agent) throws PogamutException {
                super.preStartHook(agent);
                ((ThiefBot) agent.getController()).setBotParameters(params);
            }

        };
// Connect native bot to the ucc server.
server.getAct().act(nativeAgent);
IAgent agent = botRunner.startAgent(); // THIS IS THE LINE THAT ERRORS (SOMETIMES)!!!!

// Wait until the bot finishes and close Pogamut platform.
new WaitForAgentStateChange(agent.getState(), IAgentStateStopped.class).await();
agent.kill();
Pogamut.getPlatform().close();

It appears to be a problem with the CountDownLatch, which seems to control the starting of bots, and possibly this is where
the race condition occurs, later causing the app to freeze either during execution of the bot or before
it even starts. Please understand that I am in no way trying to patronise anyone, but my priority is to get this fixed
so I can set these experiments running and start getting some results.

As always, if you have any ideas how I can even temporarily overcome these, send them my way. Once again I am really
grateful for all the hard work you are doing!

Thanks,

P.
Hi!

Don't worry about patronising: if something does not behave as you would expect in Pogamut, we're here to help!

I'm currently at work, so I can't dig deep into Pogamut, but I will try to explain what the BusAwareCountDownLatch exception is and why the latch is present in UT2004Bot, hopefully pointing you to where the original bug is. The interrupted exception alone is not the problem - it was actually triggered by an exception somewhere else.

So, whenever a UT2004Bot agent is starting (at the line IAgent agent = botRunner.startAgent();), it must wait for the handshake between the bot and GB2004 (== the initial communication in which the navigation graph, items, item classes, players, etc. are exported), so that your logic starts at a point where UT2004Bot is fully initialized (as we're employing a multi-threaded model, there is no other way than to use something like latches). So while the UT2004Bot is starting, its mediator thread (the thread that constantly parses messages from GB2004 and stuffs them into the world view) is crunching the handshake, and UT2004Bot.startAgent() (your thread) is waiting for this to finish. This is where the bug we're pursuing steps in: it throws an exception somewhere in Pogamut, which propagates a fatal error event through the UT2004Bot's component bus, tearing the whole system down == interrupting the latch (as it is a bus-aware one), ultimately resulting in the exception you've posted.

There are a few reasons that might cause that:
1) the GB2004 fails to behave as expected, exporting some strange message at certain point of handshake resulting in the exception
2) the END message was not received by the bot within 60 seconds (the UT2004Bot is not the only one waiting for the first batch of infos, UT2004BotLogic is doing that too)
3) one of our modules' (i.e., Players, Weaponry, Items) listeners fails to process some event, throwing an exception
4) one of your listeners fails to process some event, throwing an exception

In any case, there should be another exception logged prior to the interrupted exception you've posted (generally, it is always best to find the first exception that is logged, as it usually triggers many more throughout the whole Pogamut system while the agent is being torn down).
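To illustrate the mechanism described above, here is a simplified model in plain Java of a latch whose waiter is released either by the handshake finishing or by a fatal error reported from another thread. It is only a sketch: AbortableLatch and its methods are hypothetical names, and this is not how Pogamut's BusAwareCountDownLatch is actually implemented. It also shows why the exception seen by the waiting thread is a consequence, with the first recorded error being the real cause.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Simplified, hypothetical model of a latch that gives up when a fatal error is reported. */
final class AbortableLatch {
    private final CountDownLatch latch = new CountDownLatch(1);
    private final AtomicReference<Throwable> firstFatal = new AtomicReference<>();

    /** Called by the worker thread once the handshake has been processed. */
    void handshakeDone() {
        latch.countDown();
    }

    /** Called when any component reports a fatal error; releases the waiter. */
    void fatalError(Throwable cause) {
        firstFatal.compareAndSet(null, cause); // keep only the FIRST error, it is the root cause
        latch.countDown();
    }

    /** Blocks until the handshake is done; throws if a fatal error or a timeout ended the wait. */
    void await(long timeout, TimeUnit unit) {
        boolean released;
        try {
            released = latch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
            throw new IllegalStateException("Interrupted while waiting for the handshake", e);
        }
        if (!released) {
            throw new IllegalStateException("Handshake timed out");
        }
        Throwable cause = firstFatal.get();
        if (cause != null) {
            throw new IllegalStateException("Start aborted by an earlier fatal error", cause);
        }
    }
}
```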

Could you check the logs again the next time it happens?

Hope this helps somehow....

Cheers!
Jimmy

P.S.: the CountDownLatch is not used when the bot dies, so there can't be race conditions due to this latch; it is used only during start-up of the UT2004Bot
Morning,

OK, you're right, there is an error that comes first and triggers this 'avalanche', and in some ways I am not
surprised, as it seems the famous Weaponry class and its listener fail - I have had problems with this module from the beginning;
I have posted the stuff about the translocator etc... Anyway, here is the snippet that may help, as it significantly narrows
the possibilities (so I hope :-) )

(ThiefBot) INFO 19:06:31.515 Agent state switched to: BotStateInitedBot initialized.
(ThiefBot) INFO 19:06:31.515 In state KILLING.
(ThiefBot) SEVERE 19:06:31.515 Fatal error in WorldView2635: Exception raising event InfoMessageAddInventoryMsg | Id = WorldObjectIdCTF-1on1-Joust.AssaultRifle | Type = XWeapons.AssaultRifle | PickupType = ItemTypename = XWeapons.AssaultRiflePickup, category = WEAPON, group = ASSAULT_RIFLE |
(ThiefBot) SEVERE 19:06:31.515 Fatal error happenned - component bus is stopping.
FatalErrorEvent[
Component=UT2004SyncLockableWorldView
Message=Exception raising event InfoMessageAddInventoryMsg | Id = WorldObjectIdCTF-1on1-Joust.AssaultRifle | Type = XWeapons.AssaultRifle | PickupType = ItemTypename = XWeapons.AssaultRiflePickup, category = WEAPON, group = ASSAULT_RIFLE |
Cause=java.lang.NullPointerException
Cause stacktrace:
cz.cuni.amis.pogamut.ut2004.agent.module.sensomotoric.Weaponry$Ammunition.getPriAmmoForWeapon(Weaponry.java:577)
cz.cuni.amis.pogamut.ut2004.agent.module.sensomotoric.Weaponry$AddInventoryMsgListener.notify(Weaponry.java:812)
cz.cuni.amis.pogamut.ut2004.agent.module.sensomotoric.Weaponry$AddInventoryMsgListener.notify(Weaponry.java:803)
cz.cuni.amis.pogamut.base.communication.worldview.impl.AbstractWorldView$ListenerNotifier.notify(AbstractWorldView.java:98)
cz.cuni.amis.pogamut.base.communication.worldview.impl.AbstractWorldView$ListenerNotifier.notify(AbstractWorldView.java:77)
cz.cuni.amis.utils.listener.Listeners.notify(Listeners.java:252)
cz.cuni.amis.utils.listener.ListenersMap.notify(ListenersMap.java:76)
cz.cuni.amis.pogamut.base.communication.worldview.impl.AbstractWorldView.notifyLevelAListeners(AbstractWorldView.java:629)
cz.cuni.amis.pogamut.base.communication.worldview.impl.AbstractWorldView.innerRaiseEvent(AbstractWorldView.java:700)
cz.cuni.amis.pogamut.base.communication.worldview.impl.AbstractWorldView.raiseEvent(AbstractWorldView.java:604)
cz.cuni.amis.pogamut.base.communication.worldview.impl.EventDrivenWorldView.raiseEvent(EventDrivenWorldView.java:102)
cz.cuni.amis.pogamut.base.communication.worldview.impl.EventDrivenWorldView.innerNotify(EventDrivenWorldView.java:126)
cz.cuni.amis.pogamut.base.communication.worldview.impl.EventDrivenWorldView.notify(EventDrivenWorldView.java:223)
cz.cuni.amis.pogamut.base3d.worldview.impl.BatchAwareWorldView.notify(BatchAwareWorldView.java:83)
cz.cuni.amis.pogamut.ut2004.communication.worldview.UT2004SyncLockableWorldView.notify(UT2004SyncLockableWorldView.java:221)
cz.cuni.amis.pogamut.base.communication.mediator.impl.Mediator$Worker.run(Mediator.java:315)
java.lang.Thread.run(Thread.java:619)
(ThiefBot) SEVERE 19:06:31.515 Received fatal error from WorldView.

According to my small debug statements it occurred when the botInitialised method was called, but then again, if it is a multithreaded
call, it might have happened earlier, in bot prepare.

You know, Jakub, could you tell me the best way to get the weaponry info and act on it? I mean,
what would be your advice in terms of initialising and using weaponry? At the moment I am
passing this.weaponry in the botInitialised method to another object that encapsulates game, agentinfo and other crucial info
as present in UT2004BotModuleController. This object is later passed around to other classes etc.

In the meantime I can think of at least two things I can do: one is to plug in the Hunter bot instead of mine, as it uses
weaponry, and see if the error happens; the other is to eliminate the weaponry module from my bot and see if this helps.

I would really like to use this module, as its use is in many ways imperative in my experiments, and I hope this can be sorted.

Thanks a lot,

P.
Sorry, just a small idea: do you think that I could use UT2004SyncLockableWorldView to make sure that before
weapons are used they are available from the world view? If yes, how can I go about it?
Good, we have a culprit :-)

I'm a bit ashamed of myself, as I'm responsible for the Weaponry class ... the problem lies with identifiers for the various ammos/weapons which are case-sensitive and they are exported differently from time to time :-(

The way you're using Weaponry is fine: once the botInitialized method is called, Weaponry must be fully usable (if not -> it is a bug).

I will have to ask you to be patient. I will sit down tonight and go through the Weaponry module line-by-line and shield it from all possible NullPointerExceptions
logging anything that goes wrong to the console. (I would do it right away if I was not at work...).

Apologies for forcing you to use such crappy code :-(

Best,
Jimmy
UT2004SyncLockableWorldView is being used as default :-)

Jimmy
Apologies accepted :-) I strongly believe that you will manage to sort it out so it works like a charm. In the meantime while I am waiting I will focus on other bits of my project code and maybe write a page or two of my final report.

Much appreciated,

P.
Hi!

I've just committed a hardened version of the Weaponry class. Try it :-) ... but be warned - Weaponry usually had a very good reason to throw an exception. Instead of throwing exceptions, it will now log messages with Level.WARNING. Check the logs from time to time to see whether Weaponry reports problems. If so, please post them here - they will probably contain unknown item types that can be easily added to Pogamut so it won't happen again.

Cheers!
Jimmy
That was very quick :-). OK, I have checked out the code for both core and ut2004; I also had to refactor a few bits in my code, as you added an extra generic type for params in the bot runner and moved some server types up the hierarchy etc... I had a quick run and all seems to work so far - I mean, I just ran two experiments... this is obviously not enough to start getting weaponry errors. A very daft question: how do I access these logs? Are they associated with bot.weaponry.getLog() etc...? I would like to attach them to my custom logger so they are thrown in together with other important stuff.

I still need to fix a few bits in other parts, but I will try to run a couple of hundred experiments today to test the system, and as soon as I get some errors/logs I will let you know.

Btw, since I got the latest revision, does it mean that I can now use the translocator?

Thanks a lot,

P.
Yikes, I forgot about the Translocator ... well, you may try, but I think it will start to throw WARNING messages (which would tell us which ammo type is missing, so it would be good :-)).

I've committed an update to AgentModule exposing the logger via the getLog() method. You may call it on the Weaponry class and attach your own Handler or Publisher (try casting it to LogCategory - the cast will succeed, and LogCategory contains a nice method for adding simple ILogPublishers (an easy-to-implement interface)).
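A minimal sketch of attaching your own Handler, written against plain java.util.logging. It assumes that the object returned by getLog() can be used as a java.util.logging.Logger (the mention of attaching a Handler suggests so, but this is not verified here); WarningCollector is a hypothetical name, and the ILogPublisher route is not shown because its interface is not given in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Hypothetical handler that keeps WARNING-and-above records so they can be merged into a custom log. */
final class WarningCollector extends Handler {
    private final List<String> messages = Collections.synchronizedList(new ArrayList<>());

    @Override
    public void publish(LogRecord record) {
        // Keep only warnings and worse; lower levels are ignored.
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            messages.add(record.getLevel() + " " + record.getMessage());
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered outside the list, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No external resources to release.
    }

    /** Returns a snapshot of the collected messages. */
    List<String> messages() {
        synchronized (messages) {
            return new ArrayList<>(messages);
        }
    }
}
```

Under the assumption above, usage would be along the lines of weaponry.getLog().addHandler(new WarningCollector()), after which the collected messages can be copied into your own logger at the end of each run.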

Cheers!
Jimmy
Morning,

How could you forget, I am shattered now :-)

OK, I have done a few runs of the experiment and did not get the weaponry error. However, the last time I ran quite a long experiment and the app crashed terribly due to the same bot dead state error :-( - I am really
desperate to get this fixed as it seems to be the only thing holding me up. Here it is:

(ThiefBot) WARNING 23:12:35.764 Bot switched to DEAD STATE.
HEY BOT WAS KILLED ....
(ThiefBot) WARNING 23:12:35.764 FSMBotDeadState: unprocessed message InfoMessageZoneChangedBot | Id = CTF-1on1-Joust.ZoneInfo0 |
BUILD STOPPED (total time: 75 minutes 50 seconds)

So far this is the biggest obstacle to getting this experimental system finished; provided that weaponry does not hang things, I would be ready to go. Please can you have a look into this? Is there anything I can do
in my code to respond to this change of state and simply ignore it, etc...?

Thanks,

P.
So there is no exception logged in the stdout? Do you always need to stop the experiment manually?

Or it just crashes?
(This could be due to the JVM heap size. How long do you need your experiment to run? Couldn't you just increase the maximum heap size? E.g., via java -Xmx512m YourBotClass
I've already tested Pogamut for memory leaks and it should not leak (the base UT2004 gets gc()ed regularly).
If it is a JVM heap problem - do you have some listeners hooked to UT2004Bot's world views from outside the UT2004? If so, these
instances can be left hanging in the air.)
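One way to check the heap hypothesis without extra tooling is to print the heap usage after every experiment run and see whether it creeps upwards. The sketch below uses only java.lang.Runtime; HeapReport is a hypothetical helper name, and the numbers are approximate because they depend on when the garbage collector last ran.

```java
/** Hypothetical helper for logging heap usage between experiment runs. */
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    private HeapReport() {}

    /** Heap currently in use: allocated heap minus the free part of it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Upper limit the JVM will try to use (the value controlled by -Xmx). */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** One log line per run, e.g. to be printed right after the platform is closed. */
    static String describe(int run) {
        return "run " + run + ": heap used " + (usedBytes() / MB)
                + " MB of max " + (maxBytes() / MB) + " MB";
    }
}
```

Calling System.out.println(HeapReport.describe(runNumber)) at the end of each run gives a cheap trend line; a steadily growing "used" figure across hundreds of runs would point at leaked listeners or similar.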

I suppose Pogamut just hangs, not doing anything (so I assume that the heap size is sufficient and there are race conditions involved). For this case I will write a test case tonight that will force two bots to kill each other endlessly (let's say 1000x?). And more... a second test case that will reinstantiate a new bot after it is killed 10 times.

Do you have some hints on how I could enhance these tests so they would match your setup more precisely?

Best,
Jimmy

P.S.: I also assume there are no exceptions logged prior to the faulty DEAD STATE, right?
Hi,

In terms of the experimental setup, I am doing it in almost the same way as Jacob advised in the "experimental setup" topic: killing and starting the ucc server, then connecting the bots, and so on in circles.

The message about the bot state switching to dead is issued in each of the runs, but it hangs only once in a while :-)

In each run after the agents are stopped and platform is about to shutdown this warning is issued:

(ThiefBot) WARNING 22:29:50.796 In-Logic-Thread Stopping happens. This occurs whenever the LogicModule is being stopped from within its own thread. While this may proceed as you have expected, it is unsupported operation with uncertain result.
It is adviced to perform the troubling operation in different thread, e.g.:
new Thread(new Runnable() {
@Override
public void run() {
// do something that happens to stop the logic module //
}
}).start();

Do you think that I should call this.bot.stop in a separate thread?

Next during stopping the ucc server the following is always issued:

(ThiefBot) WARNING 22:29:18.467 Component UT2004BotThiefBot has stopped.
(ThiefBot) WARNING 22:29:18.467 All agent's components has stopped. Stopping agent as well.
(ThiefBot) WARNING 22:29:18.467 Unregistering JMX components.
(ThiefBot) WARNING 22:29:18.561 Thread 0: stopping the thread, received ComponentNotRunningException from UT2004SyncLockableWorldView.
(ThiefBot) WARNING 22:29:18.561 Thread 0: Logic thread stopped.
(ThiefBot) SEVERE 22:29:18.561 Killing agent ThiefBot2-2@
(ThiefBot) SEVERE 22:29:18.561 Fatal error happenned - component bus is stopping.
FatalErrorEvent[
Component=UT2004BotThiefBot
Message=agent kill() requested
Stacktrace:
cz.cuni.amis.pogamut.base.component.bus.event.ComponentBusEvents.fatalError(ComponentBusEvents.java:98)
cz.cuni.amis.pogamut.base.agent.impl.AbstractAgent.kill(AbstractAgent.java:517)
servers.tbcs.ThiefBotControlServer.runServerOnceForParameters(ThiefBotControlServer.java:214)
servers.tbcs.ThiefBotServerTask.performTask(ThiefBotServerTask.java:42)
servers.ServerSynchronisationTask.runGameServer(ServerSynchronisationTask.java:42)
servers.GameBotServer.run(GameBotServer.java:22)
]

This points to the place in my code where I do something like:

IAgent agent = botRunner.startAgent();

// Wait until the bot finishes and close Pogamut platform.
new WaitForAgentStateChange(agent.getState(), IAgentStateStopped.class).await();
agent.kill(); /// THIS IS THE THIEFBOTCONTROLSERVER : 214
Pogamut.getPlatform().close();

Do you think this could potentially be the problem, e.g. a stack overflow error?

Another thing I was thinking is that there may be loads of objects that are not gc'ed after each run of the bot, finally
filling the JVM to its limit?

My experimental setup is this:
1. map CTF-1on1-Joust, no mutators
2. two bots set on the opposite sides
3. my bot's runs consist of a fixed-length sequence of actions, and after the sequence ends a stop message is issued, so each experiment should have roughly the same length of execution
4. if the bot dies during the sequence execution it does not get any more commands, just respawns and stands still till
the sequence runs out, and then the stop is called
5. this needs to be performed, let us say, 2000 times to get some reasonable results

Please follow up if anything is unclear! It really is awesome that you are going to run these tests. On my side I will try to
get some JVM debugging tool to see exactly when and if things fill up.

One final thing, if you feel that this would help in writing test cases etc, I could upload my code to a repository so you
can have a look? If yes, please could you provide me with your private email...

Thank you a lot,

P.
Wow, now I've got some interesting results :-)))

The problem does not lie in Pogamut (hopefully) but in the GameBots2004 settings.

Have you set GameBots2004.ini correctly for your experiment? It is usually surprising for everybody (even me!) that you have to think about
the UT2004 game victory conditions. But I'm getting ahead of myself.

I've implemented the first test case: I've got 4 bots killing each other madly until the death counter in one of them reaches 100 (if so, the bots are stopped and the test succeeds).
Guess what happened the first time I ran the test? When one of the bots reached 25 frags, everything froze. Nobody was receiving messages, but the sockets stayed open. So Pogamut waited endlessly for something to happen... wow :-)

How to fix that? I asked myself. Well, go to the UT2004/System directory, edit the GameBots2004.ini file and correctly set
[GameBots2004.BotDeathMatch]
TimeLimit=120
GoalScore=200

I hope you've made the same mistake as I did :-) because the second time, the test ran flawlessly (the bots experienced around 300 deaths altogether).

Cheers!
Jimmy
I genuinely hope that this is the case, because each time I start and stop the server I use the UCCWrapper.UCCWrapperConf class to set all the params. I guess I need to use setOptions and pass all the mutators. Apart from that I will of course modify GameBots2004.BotCTFGame and set all the limits to some ridiculously high values. I don't want to be pessimistic, but I still feel it may have something to do with threading rather than the limits as such, because I stop ucc after the sequence of actions has occurred and there is no chance that any limits are reached... but please allow me some time as I am on a different machine. I am still going to run a few debugging sessions...

Thanks for this,

P.

P.S.: I will write tomorrow whether I was successful - fingers crossed :-)
Even though you're using UCCWrapper, it still calls ucc.exe, starting a GameBots2004 server that takes its defaults from GameBots2004.ini

I think we have a good chance :-)

Good luck!

Jimmy
Hello,

Yet another day on the battleground brings very few solutions :)

I ran tests by injecting around 10 native bots into the map and running my bot among them for the length of the sequence,
as that's the only way to force it to die more than once in the same experimental run (the sequence is only 40 actions long).
It never hangs, but what is interesting is that at times it gives only the warning
(ThiefBot) WARNING 10:46:44.218 Bot switched to DEAD STATE.

but another time when death happens you get

(ThiefBot) WARNING 10:46:34.140 Bot switched to DEAD STATE.
(ThiefBot) WARNING 10:46:34.140 FSMBotDeadState: unprocessed message InfoMessageZoneChangedBot | Id = CTF-1on1-Joust.ZoneInfo0 |
IN FINALLY BLOCK
HEY BOT WAS KILLED ....%o rode %k's rocket into oblivion.
Damage Type: XWeapons.DamTypeRocketby WorldObjectIdCTF-1on1-Joust.GBxBot1

This unprocessed message is what really worries me, and it appears only occasionally. However, in the previous
experimental runs where the bot hung in the dead state, the hang was always accompanied by the unprocessed message. My
current assumption is that when the dead state is entered, the respawn message is not received - the bot never leaves the dead state
(as in your leavingState method) and keeps waiting for the server to issue spawn player... could that be right?

I have ruled out the JVM: I ran debugging sessions on it and the heap size at its peak only reached 15 MB out of the 250 MB limit.
I have also put the bot stopping call into a separate thread, so this no longer causes exceptions/warnings. Finally, I have
changed agent.kill() to the less violent agent.stop() to eliminate this possibility as well. All in all, I am not getting any
exceptions apart from the dead state warning.

In terms of GameBots2004.ini modifications, I have changed a few bits, of course not in a single go but over plenty of experimental runs:
BotCTFGame:
timelimit=0
maxlives=0
goalscore=0
lateentrylives=0
bForcerespawn=True (so if it dies it gets resurrected :-) )

However, after all this the game still hangs, agrrrrr

Any more ideas what else can I try?

Thanks a lot for helping me out,

P.
Hi!

We're currently discussing your problem and have come to these conclusions:

1) just to be sure - are you really modifying the GameBots2004.ini file? Please send me your version (that is possibly buggy) to jakub.gemrot at gmail.com and I will test it

2) try to reinstall UT2004 / patch it / put latest GB2004 and try it again

3) try to run the cz.cuni.amis.pogamut.ut2004.bot.killbot.Test01_BotDeadState test case from SVN on your side - has the bug manifested?
(you will find the sources in directory PogamutUT2004/test)

If 1+2+3 don't solve it:

4) try to write a test case (as simple as possible) that replicates your bug and send its sources to jakub.gemrot at gmail.com
I'm afraid we're not able to do more for you without that :-(
(You may actually try to extend cz.cuni.amis.pogamut.ut2004.bot.killbot.Test01_BotDeadState)

Cheers!
Jimmy & Ruda

P.S.: I will also put some timeouts into UT2004SyncLockableWorldView; it seems to me that this component is hanging everything...
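Until such timeouts exist inside the platform, a long batch of experiments can be protected from the outside by giving every run a time limit and retrying it when it hangs. The following is a sketch in plain Java under stated assumptions: ExperimentWatchdog is a hypothetical name, the Callable stands for "start the server, run the bot, shut everything down", and interrupting the worker thread is not guaranteed to unblock a real hung run, so the ucc process and the agent would still need to be killed explicitly before the next attempt.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical watchdog: runs one experiment with a time limit and retries if it hangs or fails. */
final class ExperimentWatchdog {
    private ExperimentWatchdog() {}

    /** Returns the result of the first attempt that finishes in time, or null if every attempt fails. */
    static <T> T runWithTimeout(Callable<T> experiment, long timeoutMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "experiment-runner");
                t.setDaemon(true); // a stuck run must not keep the JVM alive
                return t;
            });
            Future<T> future = executor.submit(experiment);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // best effort: interrupt the stuck run
                System.err.println("Attempt " + attempt + " timed out, retrying");
            } catch (ExecutionException e) {
                System.err.println("Attempt " + attempt + " failed: " + e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // the caller asked us to stop
                return null;
            } finally {
                executor.shutdownNow();
            }
        }
        return null;
    }
}
```

With a wrapper like this, a frozen run costs one timeout instead of the whole batch, and the attempt number in the output shows how often the freeze actually occurs.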
Hello!

OK, I will definitely go through the steps you and Ruda suggested, but being me I jumped straight to point 3 first :-)
I ran your test case and surprise, surprise: straight from the word go, when the bot switched to the dead state an infinite loop started...
it goes something like this:

(UT2004Bot2) WARNING 20:55:45.405 Bot switched to DEAD STATE.
(UT2004Bot2) WARNING 20:55:45.405 FSMBotDeadState: unprocessed message InfoMessageZoneChangedBot | Id = DM-TrainingDay.ZoneInfo13 |
(UCC) INFO 20:55:45.405 ID0 In: State: Dead, BeginState()
UCCPRNT In: State: Dead, BeginState()
(UCC) INFO 20:55:45.405 ID0 We are in RemoteRestartPlayer
UCCPRNT We are in RemoteRestartPlayer
(UCC) INFO 20:55:45.468 ID0 newBot.PawnClass is None. Using GBxPawn.
UCCPRNT newBot.PawnClass is None. Using GBxPawn.
(UCC) INFO 20:55:45.468 ID0 In: State: Dead, BeginState()
UCCPRNT In: State: Dead, BeginState()
(UCC) INFO 20:55:45.468 ID0 We are in RemoteRestartPlayer
UCCPRNT We are in RemoteRestartPlayer
(UT2004Bot3) WARNING 20:55:45.609 Bot switched to DEAD STATE.
(UT2004Bot3) WARNING 20:55:45.609 FSMBotDeadState: unprocessed message InfoMessageZoneChangedBot | Id = DM-TrainingDay.ZoneInfo13 |
(UCC) INFO 20:55:45.609 ID0 In: State: Dead, BeginState()
UCCPRNT In: State: Dead, BeginState()
(UCC) INFO 20:55:45.609 ID0 We are in RemoteRestartPlayer

and so on...

Next, I stole the bit of your code responsible for respawning the bot and inserted it straight into my bot:

private void respawn() {
    getAct().act(new Respawn().setStartLocation(params.getSpawningLocation()));
}

Then I added a call to this method inside the botKilled() method, and the same loop happens.
Something similar happened in my previous debugging sessions: when the code froze on the bot dead state warning,
the server was issuing interleaved messages just like above,
State: Dead, BeginState()
RemoteRestartPlayer
State: Dead, BeginState()
RemoteRestartPlayer
State: Dead, BeginState()
RemoteRestartPlayer

Conclusion: calling the respawn method after the bot dies does not get any response from the server. Why?
Is there some port blocked that it needs, like 3000? It seems that some settings, either
in GameBots2004.ini or UT2004.ini, are wrong, as you pointed out before. If this is true I will eat my
shoe ;) But most importantly, why the heck does the bug only kick in after
some time when I execute experiments, and not at the beginning?

Thanks a ton,

P.
Hi all,

After a whole day of struggle one thing is certain: issuing the Respawn command kills the bot and then
re-initiates it at the specified location. Now, if we put the respawn method call in the
botKilled method, little wonder that it loops... When the bot is killed it changes state
to BotDeadState. Then, as the botKilled method is entered, the bot state changes
to BotStarted again, but then of course the respawn() method kills it again, so the
botKilled method is invoked again while at the same time respawn has nearly finished
initiating the bot, and the circle closes. A similar loop happens to me during experiments... I am kind of inclined
to think that this could have something to do with GameBots2004 classes such as BotDeathMatch and its
remotePlayer function....

Thanks,

P.
Hi!

Yes, I've also noticed that using Respawn in the botKilled method may lead to disaster.

How are you issuing your respawns? If you actually issue a respawn from the bot's logic() method or from outside the bot,
then you MUST NOT use respawn in the next botKilled method call. Check the Test01_BotDeadState test case,
which has a flag "utilRespawn" that is set to true when the respawn method is used from the bot's logic().
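The flag idea can be sketched in plain Java as follows. This is not the code from Test01_BotDeadState; RespawnGuard and its method names are hypothetical, and the sketch assumes that every Respawn command leads to exactly one botKilled callback (as observed earlier in this thread). If that assumption does not hold, the flag would stay set and swallow the next genuine death.

```java
/** Hypothetical guard (plain Java, no Pogamut types) that breaks the respawn/botKilled loop. */
final class RespawnGuard {
    private boolean respawnRequested = false;

    /** Call just before the logic, or code outside the bot, sends a Respawn command. */
    synchronized void markRespawnRequested() {
        respawnRequested = true;
    }

    /**
     * Call from the kill callback. Returns true when the death is a genuine one that may be
     * answered with a respawn, and false when it was caused by our own Respawn command.
     */
    synchronized boolean shouldRespawnAfterKill() {
        if (respawnRequested) {
            respawnRequested = false; // this death was self-inflicted, so swallow it once
            return false;
        }
        return true;
    }
}
```

In a bot, the respawn helper would call markRespawnRequested() before sending the command, and botKilled would only react when shouldRespawnAfterKill() returns true. With automatic spawning left on, reacting in botKilled may not be needed at all.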

Best,
Jimmy