new build server

News and Announcements related to GhostBSD
ASX
Posts: 988
Joined: Wed May 06, 2015 12:46 pm

new build server

Post by ASX »

Just a quick update, since I realize this was only discussed on IRC.

As of May 17 we have rented a new build server: Xeon E5-1650 (6c/12t), 64 GB RAM, 2 x 3 TB disks.
The previous server, a Xeon E3-1245 (4c/8t), 32 GB RAM, 2 x 2 TB disks, will be decommissioned today.

Reminder for ericbsd: we need to update the DNS to point to the new builder's IP address, and also perform the associated jail setup on the new builder.

~~~

Over the last few months I have tested several setups, looking for the best performance. The difficult part is that each test requires a lot of processing time (days), but I think that by now we have a good starting setup:
32 GB is the minimum required RAM, and we have no use for more than 64 GB;
ccache is the most critical component when using synth, and it is the (only?) component that would benefit from residing on an SSD/NVMe disk;
swap is not used at all on the 64 GB system, and saw very limited use on the 32 GB system.

I launched a build on each of the two servers approx. 19 hours ago:
Xeon E3-1245: 4400 pkgs built, rate: 238 pkgs/hour
Xeon E5-1650: 5736 pkgs built, rate: 300 pkgs/hour

Note that this is a first run, where the ccache still needs to be built up; subsequent runs should double the output.
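As a quick sanity check on the figures above, here is a minimal Python sketch that derives the elapsed time implied by each "built / rate" pair and the relative speedup. The numbers are copied from the post; the helper names are purely illustrative.

```python
# Figures copied from the post: packages built so far and reported rate (pkgs/hour).
builds = {
    "Xeon E3-1245": {"built": 4400, "rate": 238},
    "Xeon E5-1650": {"built": 5736, "rate": 300},
}


def elapsed_hours(built, rate):
    """Hours of building implied by a package count and an average rate."""
    return built / rate


def speedup(new_rate, old_rate):
    """How many times faster the new rate is compared to the old one."""
    return new_rate / old_rate


for name, b in builds.items():
    print(f"{name}: about {elapsed_hours(b['built'], b['rate']):.1f} hours of building")

ratio = speedup(builds["Xeon E5-1650"]["rate"], builds["Xeon E3-1245"]["rate"])
print(f"E5-1650 vs E3-1245: {ratio:.2f}x the package rate")
```

Both results land close to the "approx. 19 hours" mentioned above, so the reported rates look like averages over the whole run so far.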

An interesting note about UFS vs. ZFS:
ccache definitely runs better on UFS. ZFS turned out to be slow when writing lots of small files, therefore we explicitly used UFS for ccache (ada1).
The remaining components are on UFS for the Xeon 1245 and on ZFS for the Xeon 1650;
it turned out they perform comparably:

Xeon 1245: # iostat
       tty            ada0             ada1            pass0             cpu
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0  1053 35.29  91  3.12  58.50  81  4.64   0.00   0  0.00  78  0 12  0 10

Xeon 1650: # iostat
       tty            ada0             ada1            pass0             cpu
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0   722 52.14  18  0.90  58.33 157  8.93   0.44   0  0.00  74  0 16  0 11

As you can see, the idle time is nearly the same on both systems. The "system" time is a bit higher on the ZFS system, most likely because we used 12 builders in parallel there (7 builders on the UFS system).

The use of ccache (ada1 on both) is of course higher for the system running more builders;
the disk activity for the other components (ada0) is higher for the UFS system and lower for the ZFS one, and that's where the ZFS ARC cache plays a big role.
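To compare the two samples side by side, a small Python sketch like the following can pull the per-disk throughput and CPU columns out of the data lines. It assumes the exact column layout shown in the iostat output above (tty, then ada0, ada1 and pass0, then cpu); the function name and the device tuple are illustrative, not part of any tool.

```python
# Data lines copied from the iostat output in the post.
SAMPLES = {
    "Xeon 1245": "0  1053 35.29  91  3.12  58.50  81  4.64   0.00   0  0.00  78  0 12  0 10",
    "Xeon 1650": "0   722 52.14  18  0.90  58.33 157  8.93   0.44   0  0.00  74  0 16  0 11",
}

# Assumption: the same three devices, in this order, as in the pasted output.
DEVICES = ("ada0", "ada1", "pass0")


def parse_iostat(line, devices=DEVICES):
    """Split one iostat data line into per-device and CPU figures."""
    fields = line.split()
    expected = 2 + 3 * len(devices) + 5  # tin/tout, 3 columns per device, 5 cpu columns
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    result = {}
    for i, dev in enumerate(devices):
        kb_t, tps, mb_s = fields[2 + 3 * i: 5 + 3 * i]
        result[dev] = {"KB/t": float(kb_t), "tps": int(tps), "MB/s": float(mb_s)}
    us, ni, sy, intr, idle = (int(x) for x in fields[-5:])
    result["cpu"] = {"us": us, "ni": ni, "sy": sy, "in": intr, "id": idle}
    return result


for name, line in SAMPLES.items():
    s = parse_iostat(line)
    print(
        f"{name}: ada0 {s['ada0']['MB/s']} MB/s, ada1 {s['ada1']['MB/s']} MB/s, "
        f"sys {s['cpu']['sy']}%, idle {s['cpu']['id']}%"
    )
```

Keep in mind that each line is a single sample, so it only supports a rough comparison.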

Re: new build server

Post by ASX »

Once again we are hitting the same issue: loss of performance after a long synth build session.

I just completed the amd64 repository and, purely as a test, moved the repository and restarted the build:
it started running at 600 pkgs/hour;
I rebooted the server, restarted the build, and it immediately peaked at over 2000 pkgs/hour and stayed above 1000 for the first 15 minutes (then I stopped the build).

Something is wrong with the FreeBSD OS and I don't know what it is. Quite depressing.

It appears that rebooting always restores the expected performance, which then degrades again little by little...
kraileth
Posts: 312
Joined: Sun Sep 04, 2016 12:30 pm

Re: new build server

Post by kraileth »

ASX wrote: I just completed the amd64 repository and, purely as a test, moved the repository and restarted the build:
it started running at 600 pkgs/hour;
I rebooted the server, restarted the build, and it immediately peaked at over 2000 pkgs/hour and stayed above 1000 for the first 15 minutes (then I stopped the build).

Something is wrong with the FreeBSD OS and I don't know what it is. Quite depressing.

Hm! This is not cool at all. But since FreeBSD prides itself on being a highly performant OS, I would imagine that there's something special going on here. They are building packages, too, after all and have done so for quite some time. Such a radical drop in performance wouldn't have gone unnoticed and I'm sure it would have been fixed long ago.

Here are some assumptions (please correct them if you know that they are wrong):

a) Synth is used on DragonFly, too, and is probably working well there
b) Both FreeBSD and TrueOS use Poudriere instead of Synth to build their packages (don't know what others like MidnightBSD, HardenedBSD, etc. do)
c) Synth and Poudriere serve the same purpose but are quite different on a technical level

It would be interesting to know if our system behaves the same using Poudriere. If it doesn't, this would be a clue that FreeBSD has a problem with something that's specific to Synth - which would in turn narrow down things quite a bit.

Gut instinct says that it's probably something regarding memory. I'd guess that FreeBSD somehow chokes on the many builders that Synth uses or something like that.

If we manage to somewhat narrow down possible causes, we can probably find some help among people who are proficient with DTrace and look at the issue. Even though it might be related to Synth, I don't believe that the problem is with the tool but rather it's exposing an issue within FreeBSD. And in such a case there should be some people around interested in getting it fixed.

Re: new build server

Post by ASX »

kraileth wrote:
ASX wrote: I just completed the amd64 repository and, purely as a test, moved the repository and restarted the build:
it started running at 600 pkgs/hour;
I rebooted the server, restarted the build, and it immediately peaked at over 2000 pkgs/hour and stayed above 1000 for the first 15 minutes (then I stopped the build).

Something is wrong with the FreeBSD OS and I don't know what it is. Quite depressing.

Hm! This is not cool at all. But since FreeBSD prides itself on being a highly performant OS, I would imagine that there's something special going on here. They are building packages, too, after all and have done so for quite some time. Such a radical drop in performance wouldn't have gone unnoticed and I'm sure it would have been fixed long ago.

If this is an issue (in the sense of something fixable), it has indeed existed for a long time. It seems heavily aggravated when using ZFS, but I don't have much more of a clue than that.

kraileth wrote: Here are some assumptions (please correct them if you know that they are wrong):

a) Synth is used on DragonFly, too, and is probably working well there

Yes, as far as I know, and considering the reports about the pkgs rate, it works well there
(1200 pkgs/hour on a dual Xeon with a total of 16 cores/32 threads, 128 GB RAM and NVMe disks).
DragonFly certainly did some optimization when parallelizing certain tasks. I talked about that with marino and he confirmed that the DragonFly kernel has been improved in some specific areas
(basically, I noticed some slowdown upon unmounting filesystems, and indeed that is an area where DragonFly performs better).

kraileth wrote: b) Both FreeBSD and TrueOS use Poudriere instead of Synth to build their packages (don't know what others like MidnightBSD, HardenedBSD, etc. do)

Yep. The problem now is that the only person I trust is marino. I had contact with a FreeBSD developer too, and we even gave him access to the server, but in the end his prejudice against synth came out: he suggested switching to poudriere, but also stated that he had not tried synth.

kraileth wrote: c) Synth and Poudriere serve the same purpose but are quite different on a technical level

They are certainly different, but they use the same rebuild logic. Additionally, poudriere rebuilds/restarts a jail for each package, which is an overhead, but it can also take advantage of ZFS snapshots if run from ZFS.

I will add that before finding synth I looked several times at the poudriere docs (at least 1 year ago, if not more) and I never liked it very much (mainly obscure/unclear documentation).

kraileth wrote: It would be interesting to know if our system behaves the same using Poudriere. If it doesn't, this would be a clue that FreeBSD has a problem with something that's specific to Synth - which would in turn narrow down things quite a bit.

Yeah, it would be interesting, and we tried something... what more can we do than allow a FreeBSD dev to access our machine?
But in the end it turned out we cannot trust that developer; I don't trust him.
At the same time I'm 100% sure it is not a synth issue, because the problem has always resolved itself upon rebooting the machine (for some time at least).

kraileth wrote: Gut instinct says that it's probably something regarding memory. I'd guess that FreeBSD somehow chokes on the many builders that Synth uses or something like that.

I have exactly the same impression.

kraileth wrote: If we manage to somewhat narrow down possible causes, we can probably find some help among people who are proficient with DTrace and look at the issue. Even though it might be related to Synth, I don't believe that the problem is with the tool but rather it's exposing an issue within FreeBSD. And in such a case there should be some people around interested in getting it fixed.

The dev I mentioned was inspecting the system using exactly that, DTrace. Unfortunately, considering his prejudice, I think he is more interested in demonstrating that it is NOT a FreeBSD issue.

Re: new build server

Post by ASX »

A follow-up about the lack of performance:

1) "All SATA controllers are NOT created equal"

1x 2TB a single drive - 1.8 terabytes - Western Digital Black 2TB (WD2002FAEX)

 Asus Sabertooth 990FX sata6 onboard ( w= 39MB/s , rw= 25MB/s , r= 91MB/s )
 SuperMicro X9SRE sata3 onboard      ( w= 31MB/s , rw= 22MB/s , r= 89MB/s )
 LSI MegaRAID 9265-8i sata6 "JBOD"   ( w=130MB/s , rw= 66MB/s , r=150MB/s )

source: https://calomel.org/zfs_raid_speed_capacity.html
Note that our E5-1650 server is using a Supermicro MB, although the disk performance on our system doesn't appear to be affected so severely.

2) ccache works really well up to a certain size, but performance decreases a lot once the cache becomes difficult to manage: in our case a 100 GB ccache made up of 3M files, average size 24 KB.

UFS adds a big penalty when there are lots of files in a single directory, and increasing "cache_dir_levels" (from 2 to 4) doesn't appear to improve the situation.

ZFS also has some performance penalty when dealing with lots of small files, although I have managed to increase the R/W performance (by reducing the dataset "recordsize").
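For a feel of what the "cache_dir_levels" change means for directory sizes, here is a rough Python estimate. It rests on two assumptions that are not from the post: that ccache uses one hexadecimal digit per directory level (16 subdirectories per level), and that the 3M files end up evenly spread over the leaf directories.

```python
# Assumption (not from the post): one hex digit per directory level.
SUBDIRS_PER_LEVEL = 16

# File count reported in the post.
TOTAL_FILES = 3_000_000


def leaf_directories(levels):
    """Number of leaf directories for a given number of directory levels."""
    return SUBDIRS_PER_LEVEL ** levels


def files_per_directory(total_files, levels):
    """Average files per leaf directory, assuming an even spread."""
    return total_files / leaf_directories(levels)


for levels in (2, 4):
    print(
        f"cache_dir_levels={levels}: {leaf_directories(levels)} leaf directories, "
        f"about {files_per_directory(TOTAL_FILES, levels):.0f} files each"
    )
```

These are averages for a cache fully populated under the given setting; they say nothing about why the change did not help in practice.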

3) There is a scalability issue affecting the FreeBSD kernel, specifically around "umount":
synth makes extensive use of tmpfs and null mounts, up to 24 for each builder/package built.

It appears that some syscall requires exclusive access when executed (either unmount, or freeing a tmpfs, or both), which means that these unmounts are serialized and cannot be executed in parallel.

Additionally, upon "umount" of a filesystem, FreeBSD calls "sync", which causes the cached blocks to be written to the filesystem (during a synth pkg "deinstall" phase, sync() will be called 24 times... for each builder).

The net result, when building lots of small packages, is that all these umounts end up happening simultaneously (thanks to the sync() call), and ultimately the CPU sits unused, waiting 30 to 40 seconds for the serialized unmounts to complete.

Much worse than that, it also appears that after multiple hours of "building" the unmount becomes even slower, at least on a ZFS filesystem (cause unknown).
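To put the stall in perspective, the following back-of-the-envelope Python sketch estimates the per-unmount cost implied by the numbers above. It assumes the worst case where all builders tear down at the same moment, and it borrows the builder count of 12 from the first post; both are assumptions for illustration, not measurements.

```python
MOUNTS_PER_BUILDER = 24  # "up to 24" tmpfs/null mounts per builder, from the post
BUILDERS = 12            # assumption: builder count used on the E5-1650 (first post)


def serialized_unmounts(builders, mounts_per_builder):
    """Unmounts that must run one after another if every builder finishes at once."""
    return builders * mounts_per_builder


def implied_seconds_per_unmount(stall_seconds, builders, mounts_per_builder):
    """Average cost of one unmount if the whole stall is spent unmounting."""
    return stall_seconds / serialized_unmounts(builders, mounts_per_builder)


total = serialized_unmounts(BUILDERS, MOUNTS_PER_BUILDER)
print(f"{total} unmounts in the serialized queue")
for stall in (30, 40):  # observed wait in seconds, from the post
    cost = implied_seconds_per_unmount(stall, BUILDERS, MOUNTS_PER_BUILDER)
    print(f"{stall} s stall -> about {cost:.3f} s per unmount")
```

Under these assumptions each unmount costs on the order of a tenth of a second, which is small on its own but adds up quickly once the operations cannot overlap.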

If nothing else, all of the above pushes us toward multiple builders, possibly using SSDs, or "striping" as a fallback, because clearly disk I/O is going to affect the builder performance.

FYI only. ;)