Data compression
Message boards : Wish list : Data compression
Because of the size of each file (around 1 MB), hosts on dialup have to wait a long time when workunits are downloaded. I estimate that the number of such hosts is still large, so I wish the distributed workunits were compressed where possible, so that every host can crunch as many as it can.
ID: 635
This is actually on our todo list, but not with very high priority right now. Thanks for the input though!
ID: 670
ID: 2682
smaller input/output files are always welcome...
ID: 4342
Due to the scientific requirements of the application, we had to extend the outputs. The inputs, I think, are not too big. But we will discuss this item at the next D@H meeting.
ID: 4352
When a user has an ISP that caps usage at 5 GB a month (like my provider does), the input/output file sizes add up after a couple of weeks. File compression on both input and output files is one way to help: mod_deflate the inp filetype for the input downloads and gzip_when_done for the output files.

After checking the inp files, an even better saving could be achieved, but it may require changing the code, so I wouldn't anticipate going this route. Here are a couple of ideas anyway:

1) Each inp file for the same complex contains identical information except for the first line, where the set seed=xxxxxx value differs for each task. To save bandwidth, split the current inp file into two files: a dat file and an inp file. The inp file would contain only the set seed=xxxxxx parameter and be about 20 bytes long. The dat file would contain everything that now starts on the 2nd line of the current inp files, since that part is always the same. The dat file for each complex would then need to be downloaded only once, at the start of a new complex run. The inp file would still change with each task and be downloaded with each task assigned, but each download would be about 20 bytes instead of over 1 MB. Using the current complex, an example would look like this:

    1hvj_mod0011sc.dat   1 MB   one-time download
    46068_412698.inp     20 B   1st task assigned
    46956_226898.inp     20 B   2nd task assigned

2) Or, if possible, start the task from a command line and pass the seed as a command-line argument. This is similar to how Rosetta runs its tasks: I downloaded a couple of zip files containing the data needed to crunch a task, approximately 4 MB combined.
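As a rough sanity check on the compression suggestion (a hypothetical sketch, not the project's actual tooling): DEFLATE-style compression, as used by both mod_deflate and gzip, shrinks repetitive text like these input files dramatically. A minimal Python sketch, assuming a synthetic inp-like payload:

```python
import gzip

# Build a synthetic payload shaped like an inp file (hypothetical content):
# one seed line followed by many repetitive parameter lines.
seed_line = "set seed=412698\n"
body = "".join(f"set param_{i % 50}=value_{i % 50}\n" for i in range(20000))
payload = (seed_line + body).encode("ascii")

# gzip.compress applies DEFLATE, the same algorithm mod_deflate uses.
compressed = gzip.compress(payload)

ratio = len(compressed) / len(payload)
print(f"original: {len(payload)} B, gzipped: {len(compressed)} B, ratio: {ratio:.3f}")
```

On text this repetitive the compressed size is a small fraction of the original, which is the whole point for dialup and capped-bandwidth hosts.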
But then the next 8 tasks assigned to me used the same input files, so no downloads were necessary. If Rosetta worked the way Docking does, the same 4 MB would have been downloaded each time with just the random seed changing. This method would require removing the seed line from the inp file, but, like the first idea, it would mean the input files only need to be downloaded once at the start of each new complex run.
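The split described in idea 1 can be sketched as follows. This is a hypothetical illustration: the set seed=xxxxxx line format comes from the post above, while the function names and the sample body are assumptions.

```python
# Hypothetical sketch of idea 1: split a per-task inp file into a
# shared per-complex part (.dat) and a tiny per-task seed part (.inp).

def split_workunit(inp_text: str) -> tuple[str, str]:
    """Server side: return (seed_part, shared_part) from a full inp file.

    The first line holds the per-task seed; everything else is identical
    across tasks of the same complex.
    """
    seed_line, _, rest = inp_text.partition("\n")
    return seed_line + "\n", rest

def rebuild_workunit(seed_part: str, shared_part: str) -> str:
    """Client side: reassemble the full inp file before crunching."""
    return seed_part + shared_part

# A stand-in for a real inp file: seed line plus a large repeated body.
full_inp = "set seed=412698\n" + "parameter block line\n" * 1000

seed_part, shared_part = split_workunit(full_inp)

# The per-task download is now ~20 bytes; the big part ships once per complex.
print(len(seed_part), len(shared_part))
```

The design point is that the dat file is content-addressed by complex, so the BOINC client's existing "don't re-download a file it already has" behavior would do the rest.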
ID: 4359