Why no credit for this WU?
Message boards : Number crunching : Why no credit for this WU?
Author | Message | |
---|---|---|
http://docking.utep.edu/workunit.php?wuid=41117
|
||
ID: 3016 | Rating: 0 | rate: / | ||
http://docking.utep.edu/workunit.php?wuid=41117

The result stderr reads:

<core_client_version>5.8.15</core_client_version>
<![CDATA[
<stderr_txt>
Starting charmm run...
Starting charmm run...
Starting charmm run...
No heartbeat from core client for 31 sec - exiting
Starting charmm run...
SUCCESS - Charmm exited with code 0.
Resolving file charmm.out...
Calling BOINC finish.
</stderr_txt>
]]>

It's also the older 5.04 client which had the checkpointing problem. It started/restarted the work unit 4 times. My guess would be that one of those times the client had a bad checkpoint and it restarted with corrupted data. Hopefully, that's fixed in the 5.05 client that was just released.

I finally set "Leave applications in memory while suspended?" to yes in the general preferences for my machines. That helps a lot, but if you turn off your machine, it still has to continue from the last checkpoint. The new client is supposed to fix that.

I'm just a volunteer so I don't have access to the scientific data returned and wouldn't know how to read it if I did have access, but that corrupted checkpoint is usually the problem when you see multiple "Starting charmm run..." in the output file for the workunit on version 5.04.

Hope this helps,
-- David
____________
The views expressed are my own. Facts are subject to memory error :-)
Have you read a good science fiction novel lately?
|
||
ID: 3021 | Rating: 0 | rate: / | ||
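For anyone who wants to apply the check described in the post above to their own results, here is a rough Python sketch. It just counts how many times "Starting charmm run..." appears in a pasted stderr and whether a heartbeat loss was logged. The function name `summarize_stderr` and the dictionary keys are made up for illustration, and the marker strings are taken only from the stderr quoted in this thread, so other application versions may log different text. A high restart count is only a hint that a checkpoint may have gone bad, not proof.

```python
def summarize_stderr(stderr_text: str) -> dict:
    """Count charmm starts and heartbeat losses in a pasted result stderr.

    Hypothetical helper: the marker strings below are copied from the
    stderr quoted in this thread and are assumed, not guaranteed, to be
    the same for other work units.
    """
    starts = stderr_text.count("Starting charmm run...")
    heartbeat_losses = stderr_text.count("No heartbeat from core client")
    finished_cleanly = "SUCCESS - Charmm exited with code 0" in stderr_text
    return {
        "starts": starts,
        # The first start is not a restart, so subtract one (never below 0).
        "restarts": max(starts - 1, 0),
        "heartbeat_losses": heartbeat_losses,
        "finished_cleanly": finished_cleanly,
    }


# The stderr text quoted earlier in this thread.
SAMPLE = """Starting charmm run...
Starting charmm run...
Starting charmm run...
No heartbeat from core client for 31 sec - exiting
Starting charmm run...
SUCCESS - Charmm exited with code 0.
Resolving file charmm.out...
Calling BOINC finish.
"""

if __name__ == "__main__":
    print(summarize_stderr(SAMPLE))
```

Note that a clean exit code does not rule out a bad result: the run quoted above finished with code 0 and still got no credit.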
If that is the case, should I finish or abort the remaining WUs that would be running the Charmm 5.04?
|
||
ID: 3022 | Rating: 0 | rate: / | ||
If that is the case, should I finish or abort the remaining WUs that would be running the Charmm 5.04?

If you don't have enough memory/swap to set the machine to "Leave applications in memory while suspended?" or the machine is powered on and off a lot, they stand a higher chance of being bad. OTOH, I have seen some that started charmm 2 or 3 times and validated.

I know that when Microsoft releases a bunch of patches for the OS and I have to reboot my machine several times during a single WU, I usually abort the WU unless it's at least 3/4 finished.

It's just my personal opinion but if you don't think the WU has much of a chance of returning correct results, I'd abort it and let the computer work on a 5.05 WU. OTOH, there are probably still 5.04 WUs that need to meet their quorum so it's likely still possible to get one of those. I wouldn't advise anyone to abort a WU unless they're pretty sure it's going to return a bad result.

I hope this helps.
-- David
____________
The views expressed are my own. Facts are subject to memory error :-)
Have you read a good science fiction novel lately?
|
||
ID: 3024 | Rating: 0 | rate: / | ||