archive - snowblossom - slack - mining

2018-11-21 00:00:10

Rotonen

and `Binary search is a simple example that could benefit from explicit prefetching. The access pattern in a binary search looks pretty much random to the hardware prefetcher, so there is little chance that it will accurately predict what to fetch.` sounds relevant

2018-11-21 00:00:23

Rotonen

oh well, enough fun to have discovered a whole field of rabbit holes for now

2018-11-21 00:02:33

Rotonen

and this seems like almost snowblossom mining https://stackoverflow.com/questions/7327994/prefetching-examples/50280085#50280085 Can anyone give an example or a link to an example which uses __builtin_prefetch in GCC (or just the asm instruction prefetcht0 in general) to gain a substantial performance advantage? In particula...

2018-11-21 00:02:49

Rotonen

`maybe surprisingly, the less CPU-bound task the bigger the speed-up: we are able to hide the latency almost completely, thus the speed-up is` sounds so very familiar

2018-11-21 00:03:57

Rotonen

now this is just funny in regards to how much of the advancements from the past 10 years are in the way of mining snowblossom :smile: https://stackoverflow.com/a/45201673/1214697 Some CPU and compilers supply prefetch instructions. Eg: __builtin_prefetch in GCC Document. Although there is a comment in GCC's document, but it's too short to me. I want to know, in prantice, w...

2018-11-21 00:04:29

Rotonen

was not expecting power savings to be in the mess as well `On recent Intel chips one reason you apparently might want to use prefetching is to avoid CPU power-saving features artificially limiting your achieved memory bandwidth.`

2018-11-21 00:04:48

Rotonen

yeah, java does all this stuff actually well, good luck to anyone implementing a miner in anything else :smile:

2018-11-21 00:17:24

Rotonen

but yeah, i guess these would be fun for arktika: 8 channels per socket, 4 sockets, 100GE https://www.anandtech.com/print/13620/huawei-server-efforts-hi1620-and-arms-big-server-core-ares Huawei Server Efforts: Hi1620 and Arm’s Big Server Core, Ares

2018-11-21 00:27:43

Fireduck

INFO: RPC Server: read_ops/s: 4610388.4 rpc_ops/s: 4502.3 network_bw: 70.3 MB/s

2018-11-21 00:27:46

Fireduck

that is more like it

2018-11-21 00:29:02

Rotonen

i'll believe a poolside 1h average :stuck_out_tongue:

2018-11-21 00:30:56

Fireduck

heh

2018-11-21 00:31:02

Fireduck

you should see about 1mh in an hour

2018-11-21 00:31:49

Rotonen

i'm seeing that as a spot rate currently, but that's fluctuating between 200k and 1M

2018-11-21 00:31:58

Rotonen

i think the pool tries to tell me things in the log way too often

2018-11-21 00:33:10

Rotonen

yeah, just had a `INFO: Mining rate: 0.000/sec - at this rate ∞ hours per block`

2018-11-21 00:33:23

Rotonen

maybe it could tell me the averages every 5min or so

2018-11-21 00:35:02

Fireduck

Heh yeah

2018-11-21 00:35:43

Rotonen

what's the relation of that 1M poolside figure and the 4M from the output you posted?

2018-11-21 00:36:48

Fireduck

Those are individual value reads

2018-11-21 00:37:02

Fireduck

So 6 of those is one hash attempt

2018-11-21 00:37:26

Rotonen

that'd land you at 800kH/s, though?
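
The arithmetic behind that estimate, as a minimal sketch: it assumes, per the message above, that one hash attempt costs exactly 6 value reads, and it takes the `read_ops/s` figure from the RPC Server log line posted at 00:27:43. The function name is made up for illustration.

```python
# "6 of those is one hash attempt" (from the chat above)
READS_PER_HASH = 6


def hash_rate(read_ops_per_sec: float) -> float:
    """Hash attempts per second implied by a read_ops/s figure."""
    return read_ops_per_sec / READS_PER_HASH


# read_ops/s value from the RPC Server log line posted at 00:27:43
rate = hash_rate(4610388.4)
print(f"{rate / 1000:.0f} kH/s")  # prints "768 kH/s"
```

That comes out a little under the 800kH/s quoted in the chat, which reads as a rounded figure.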

2018-11-21 00:37:42

Fireduck

INFO: 1-min: 105.271K/s, 5-min: 105.735K/s, hour: 51.313K/s
Nov 20, 2018 4:37:28 PM snowblossom.miner.Arktika printStats
INFO: Layer 0: read_ops/s: 6995959.1 read_bw: 27328.0 MB/s
Nov 20, 2018 4:37:28 PM snowblossom.miner.Arktika printStats
INFO: Layer 1: read_ops/s: 0.0 read_bw: 0.0 MB/s
Nov 20, 2018 4:37:28 PM snowblossom.miner.Arktika printStats
INFO: RPC Server: read_ops/s: 6351366.5 rpc_ops/s: 6202.5 network_bw: 96.9 MB/s
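
A rough sanity check on these log figures, as a sketch rather than anything taken from the Arktika source: dividing each bandwidth by its op rate gives the bytes moved per read. It assumes the log's "MB" means 2**20 bytes, which the log does not state, and the helper name is hypothetical.

```python
MIB = 1024 * 1024  # assumption: the log's "MB" means 2**20 bytes


def bytes_per_op(bandwidth_mb_s: float, ops_per_sec: float) -> float:
    """Average bytes moved per operation, from a bandwidth and an op rate."""
    return bandwidth_mb_s * MIB / ops_per_sec


# Figures copied from the Layer 0 and RPC Server log lines above
layer0 = bytes_per_op(27328.0, 6995959.1)
rpc = bytes_per_op(96.9, 6351366.5)

print(f"Layer 0: {layer0:.0f} bytes per read")   # prints "Layer 0: 4096 bytes per read"
print(f"RPC Server: {rpc:.1f} bytes per read")   # prints "RPC Server: 16.0 bytes per read"

# At 6 reads per hash attempt (from the chat), the RPC op rate implies:
print(f"{6351366.5 / 6 / 1e6:.2f} MH/s")         # prints "1.06 MH/s"
```

Under that assumption the numbers are internally consistent, and the implied rate is in line with the "about 1mh" mentioned earlier.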

2018-11-21 00:37:48

Fireduck

I was just getting warmed up

2018-11-21 00:38:08

Rotonen

that makes sense

2018-11-21 00:38:10

Fireduck

that is the r900 reading from ram and dishing out on network

2018-11-21 00:38:37

Rotonen

that's 8 channels it pulls off of on that side?

2018-11-21 00:38:38

Fireduck

using most of the CPU to do it, not sure that adding 10g will help

2018-11-21 00:38:57

Fireduck

I have no idea how many channels it has

2018-11-21 00:39:18

Rotonen

xeon is 4 channels per socket and that's a dual socket system?

2018-11-21 00:39:54

Rotonen

and as ryzen is 2 channels per socket and limited to one socket systems, i'm a bit mystified as to its allure

2018-11-21 00:41:02

Rotonen

from my point of view you're taking only a 10% to 15% performance hit vs. local ram there with arktika

2018-11-21 00:41:22

Fireduck

4 socket, E7450

2018-11-21 00:41:42

Rotonen

that should land you around 3MH/s

2018-11-21 00:41:47

Fireduck

these CPUs seem to suck, I can only get them to mine at about 200kh/s directly

2018-11-21 00:41:56

Rotonen

if all ram is local and evenly distributed (and the reads too)

2018-11-21 00:42:25

Rotonen

that sounds more like hardware locality issues there

2018-11-21 00:43:01

Rotonen

like putting most of the field by dumb luck onto one numa node (and there asymmetrically across channels within that node too)

2018-11-21 00:43:47

Rotonen

the page cache has numa aware magic about it so i'm curious, if that R900 does have enough ram and the channels are evenly populated, what'd it pull 'off the disk' after a cat to /dev/null

2018-11-21 03:06:39

cryptovape

i have a server with 256gb DDR4 ram, Intel Xeon E5-1650 V3 getting 870kH/s by precaching in RAM. However, with 64 GB ram with precache it's only getting me 37kH/s... am I wrong to expect it to get more?

2018-11-21 03:07:40

Fireduck

If you have other machines on a lan with the 256gb one you can get more out of that setup

2018-11-21 03:09:09

cryptovape

oooo... would I install Arktika on the other computers and use a remote layer to the server's IP?

2018-11-21 03:09:23

Rotonen

yes

2018-11-21 03:10:20

cryptovape

Awesome, yeah the CPU is maxed out. What is the minimum RAM i would need for field 7 RAM mining?

2018-11-21 03:22:44

Fireduck

Ideally you would be able to get the entire field in ram

2018-11-21 03:23:02

Fireduck

But it doesn't need to all be on one machine
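
One way to see why having the whole field in RAM matters so much, as a back-of-the-envelope sketch only: it assumes each of the 6 reads per hash attempt (from the chat earlier) lands uniformly at random and independently across the field, and that a fixed fraction of the field is resident in RAM. The function names and the 0.5 fraction are hypothetical, not measurements from any machine mentioned here.

```python
READS_PER_HASH = 6  # from the chat: "6 of those is one hash attempt"


def p_all_reads_in_ram(resident_fraction: float, reads: int = READS_PER_HASH) -> float:
    """Chance that every read of one hash attempt hits RAM,
    assuming uniform, independent reads (a simplifying assumption)."""
    return resident_fraction ** reads


def expected_misses(resident_fraction: float, reads: int = READS_PER_HASH) -> float:
    """Expected number of reads per hash attempt that miss RAM."""
    return reads * (1.0 - resident_fraction)


# Hypothetical example: half of the field resident in RAM
print(p_all_reads_in_ram(0.5))  # 0.015625
print(expected_misses(0.5))     # 3.0
```

Under these assumptions the all-in-RAM probability falls off with the sixth power of the resident fraction, so a partly cached field behaves much more like disk than like RAM.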

2018-11-21 03:25:27

Fireduck

Any reason to not use rj45 10gbe network?

2018-11-21 03:28:00

Fireduck

Cables will be a few meters at most

2018-11-21 03:38:32

cryptovape

the DDR4 machines are rentals remotely... but I did order 144gb DDR3 ram for my R610, so I may try if CPUs are maxed.

2018-11-21 03:42:39

cryptovape

Also currently have a CPU mining setup for BOINC (for Biblepay and ByteBall) with ~25 i7-3770s 4/8gb RAM, so may want to try to incorporate that somehow.

2018-11-21 03:45:08

Fireduck

How much do those rentals cost?

2018-11-21 03:47:24

cryptovape

got the 256gb for $129...

2018-11-21 03:48:15

Fireduck

$129 per month,day,hour?

2018-11-21 03:48:19

cryptovape

month

2018-11-21 03:48:28

Fireduck

Impressive

2018-11-21 03:52:29

cryptovape

i think the 64gb was $75, but not worth it at 37kH/s

2018-11-21 06:51:22

Fireduck

you can get a little more from that using arktika

2018-11-21 06:51:35

Fireduck

so that you have a separate queue for your memory misses

2018-11-21 07:01:00

Fireduck

wouldn't expect anything amazing

2018-11-21 09:36:16

Rotonen

renting 256GB one socket quad channel hardware for around 100 per month is indeed a thing, also available in europe from various vendors

2018-11-21 09:37:36

Rotonen

@Fireduck the windows penalties are not as bad as thought of, someone hit 1.1MH/s on an asymmetrically populated dual socket xeon on windows

2018-11-21 11:18:59

fydel

@cryptovape where are you renting the 256GB machine?

2018-11-21 16:02:23

Fireduck

@Rotonen shit speed:

2018-11-21 16:02:24

Fireduck

nerd@jet:/var/shm/snow/snowblossom.7$ cat snowblossom.7.snow.* | dd if=/dev/stdin of=/dev/null bs=32k
566180+0 records in
566180+0 records out
18552586240 bytes (19 GB, 17 GiB) copied, 23.8856 s, 777 MB/s

2018-11-21 16:02:58

Fireduck

nevermind, rsync was still running

2018-11-21 16:03:18

Fireduck

still terrible:

2018-11-21 16:03:19

Fireduck

nerd@jet:/var/shm/snow/snowblossom.7$ cat snowblossom.7.snow.* | dd if=/dev/stdin of=/dev/null bs=32k
814943+0 records in
814943+0 records out
26704052224 bytes (27 GB, 25 GiB) copied, 32.8304 s, 813 MB/s
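
For reading the two `dd` reports, a small sketch of how the figures relate: records times block size gives the byte count, and bytes over seconds gives the rate, which `dd` prints in decimal megabytes. The helper names are made up for illustration; the inputs are copied from the output above.

```python
BLOCK_SIZE = 32 * 1024  # bs=32k, where dd's "k" suffix means 1024 bytes


def dd_bytes(records: int, block_size: int = BLOCK_SIZE) -> int:
    """Total bytes copied for a number of full records."""
    return records * block_size


def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal MB/s (10**6 bytes), the unit dd reports."""
    return total_bytes / seconds / 1e6


# First run (rsync still running), then the second run
print(dd_bytes(566180), f"{throughput_mb_s(18552586240, 23.8856):.0f} MB/s")
# prints "18552586240 777 MB/s"
print(dd_bytes(814943), f"{throughput_mb_s(26704052224, 32.8304):.0f} MB/s")
# prints "26704052224 813 MB/s"
```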

2018-11-21 16:04:42

Fireduck

anyways, since I learned that doing -Xms along with -Xmx makes it easier to fit things in ram I am less concerned about my inability to quickly read from /var/shm

2018-11-21 16:20:29

Fireduck

I've determined my old Dell has enough CPU capacity to spit out some more bits so I'm upgrading it to 10 gigabit

2018-11-21 16:43:19

Rotonen

@Fireduck the dd was not what i was asking about, and use a 1G blocksize for speedups

2018-11-21 16:43:50

Fireduck

I just use dd since it gives me a nice report

2018-11-21 16:44:24

Rotonen

how’s the mining ’off the disk’ now that the file is in the page cache?

2018-11-21 16:45:50

Rotonen

and cat is probably the quickest way to cache the file, dd just has the sequential read benchmark aspect to it

2018-11-21 16:46:54

Rotonen

and i guess you sigint your way out at that point or sigusr1 for the interim reports as that’s only read a few tens of GB so far?

2018-11-21 16:47:13

Fireduck

This is shared memory filesystem so not sure if the page cache is even a thing

2018-11-21 16:47:30

Rotonen

it is not

2018-11-21 16:47:38

Fireduck

My benchmark mode shows it at 18gb/s

2018-11-21 16:48:02

Fireduck

Vs well over 100gb/s for jvm heap

2018-11-21 16:48:14

Rotonen

i’m curious as to what you get when you have not filled the ram with shm and have catted the field once

2018-11-21 16:48:46

Fireduck

So mine from SSD but use cat to load cache?

2018-11-21 16:48:51

Rotonen

yes

2018-11-21 16:48:53

Fireduck

I'll give it a shot

2018-11-21 16:49:41

Rotonen

should work from a spinny disk just as well too :P

2018-11-21 16:49:54

Rotonen

just slower to cache

2018-11-21 16:51:24

Rotonen

i like doing that as then you never hit the heap size nonsense

2018-11-21 17:06:30

Fireduck

16.7GB/s

2018-11-21 17:06:43

Fireduck

and I know that is cache because that SSD is terrible

2018-11-21 17:17:19

Rotonen

so about 1/6 vs. memfield?
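
The ratio being asked about, worked through as a quick sketch: the three throughput figures are the ones quoted in the chat, and since the heap number was given as "well over 100gb/s", 100 is used here as a lower bound, which makes the resulting ratios upper bounds. Variable names are illustrative only.

```python
# Benchmark figures quoted in the chat, in GB/s
page_cache_gb_s = 16.7   # SSD-backed file after cat, served from page cache
shm_gb_s = 18.0          # shared memory filesystem
heap_gb_s = 100.0        # "well over 100gb/s", so treated as a lower bound

cache_vs_heap = page_cache_gb_s / heap_gb_s
shm_vs_heap = shm_gb_s / heap_gb_s

print(f"page cache vs heap: {cache_vs_heap:.3f}")  # prints "page cache vs heap: 0.167"
print(f"shm vs heap: {shm_vs_heap:.3f}")           # prints "shm vs heap: 0.180"
```

So "about 1/6" holds at the 100 GB/s floor, and the gap is wider if the heap figure is higher.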

2018-11-21 18:13:52

cryptovape

check your DM!

2018-11-21 18:20:41

Fireduck

On that system. Might have something to do with having 4 sockets.

2018-11-21 18:20:56

Fireduck

Maybe Java is doing some numa magic? Who knows.

2018-11-21 18:24:46

Rotonen

it is, but so should the page cache