Linux, musical road-dogging, and daily life by Paul W. Frields
 
Notes from a happy camper.

Wikimedia posted a FAQ to set the record straight about their recent server standardization. This is helpful, since a lot of people (including overzealous journalists) were somehow under the impression that this was a sudden move that involved ripping out hundreds of installed Red Hat Enterprise Linux or CentOS boxes.

However, the FAQ also introduces some unnecessary confusion about the performance of yum, and I figured that needed to be set straight. Readers might take away from the FAQ that current yum is somehow “performance-challenged,” when nothing could be further from the truth.

True, in the days of Fedora Core 3 and 4, about three years ago, yum didn’t perform as well as it does today. A metadata parser rewrite, along with a metric boatload of other optimizations and intelligent code straightening done by the yum team, has resulted in a snappy, flexible, and incredibly useful dependency solver. In short, modern yum makes software management on Fedora systems fast and easy. The performance nowadays is probably close to two orders of magnitude better than in the FC 3/4 timeframe.

To make the situation clear to readers about how yum works on systems like Fedora 9, while acknowledging that it’s completely up to the Wikimedia folks what they do with their servers, I wrote this clarification. I included links to some helpful information posted by James Antill regarding yum benchmarks, and information on what makes yum unique.

4 Comments

  1. Nick Danger

    I would also like to point out that the performance of a package management program should be more or less irrelevant in the operation of a distributed clone server farm. The image each machine runs should be static, and take only those updates that apply to your system. Update stability and quality should trump speed. Fit your OS choice to your requirements, not your tastes, first! Then add your tastes and style.

    Amusingly, and contradictingly, the FAQ featured this question:
    Q: I had a bad experience with Ubuntu two years ago, which proves it is terrible and you are idiots for using it.
    A: When I was six, another kid kicked a soccer ball right into my nose during a game, and I cried all the way home. It would, however, be a logical fallacy for me to conclude from this that soccer is a terrible game which no one should play.

    And yet earlier on states:
    Q: Why did you stop using Red Hat Enterprise Linux!?!?
    A: We never used Red Hat Enterprise Linux — we originally had an ad-hoc mix of old Red Hat 9 and Fedora 2, 3, and 4 systems, which we were interested in replacing with a more standardized infrastructure to simplify our internal server setup and administration.

    Red Hat 9? Old (and from me that’s saying something). When was Fedora 4, 2005? Amusing contradiction in those two answers there…

    My point? Pick what does the job first, then pick what you like. That is all the justification you need.

    (For the record, I have a few systems running Ubuntu and Fedora, dozens running CentOS, a few running OpenBSD, and *shame* XP.)

  2. Hi,

    >> To top it off, while RPM isn’t too awful, yum is slow and annoying as a package manager and we just don’t like it.

    > The performance nowadays is probably close to two orders of magnitude improved over the FC 3/4 timeframe.

    Note that this doesn’t actually answer their observation that other distributions use faster package managers than yum/rpm. As someone who’s deployed both Fedora and Ubuntu systems, I’d certainly sympathize with the viewpoint that yum is frustratingly slow in comparison to apt.

    Hm, I should give some numbers to avoid being piled on. I can do that.

    Ubuntu gutsy, on a dual-core AMD with 4G RAM:
    sudo apt-get install jed … 5.860 total

    Fedora rawhide, yum updated a few days ago, on a quad-core AMD with 6G RAM:
    sudo yum -y install jed … 21.735 total

    The two packages were similar in size and neither had any additional required dependencies. I feel this test was fair, and weighted on the side of the more powerful Fedora box if anything.

    So, while it’s fine to boast about how much better yum/rpm is than it used to be, it’s worth bearing in mind that you’re still nearly four times slower than your competitor on a good day.
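
    For anyone checking the arithmetic, the ratio of the two timings quoted above can be worked out with a one-liner. This is only a small sketch using awk; the variable names are illustrative and the two numbers are simply the totals reported above.

```shell
# Wall-clock totals quoted above, in seconds.
apt_seconds=5.860
yum_seconds=21.735

# awk handles the floating-point division that plain sh arithmetic lacks.
ratio=$(awk -v a="$apt_seconds" -v y="$yum_seconds" 'BEGIN { printf "%.2f", y / a }')
echo "yum took ${ratio}x as long as apt-get"
```

    That comes to roughly 3.7 times the apt-get wall-clock time for this single run.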

    – Chris. (Who runs Fedora by choice on his laptop and workstation.)

  3. @Chris: You probably should read the other links I posted, including James Antill’s post about why comparing yum directly to some other depsolvers like apt, smart, etc. is an apples-to-oranges deal.

    I understand that you tried to provide a fair comparison, but you can’t do that with the commands you ran. For instance, yum automatically updates its metadata, which apt doesn’t. Did you count those times separately? Also, yum is parsing and computing dependency information based on a large number of factors. Yum also computes for versioned and unversioned provides, requires, conflicts, and obsoletes, as well as explicit file dependencies — not just package names. Check this functionality against that provided by apt. I’m not saying one is better than the other — I’m saying that to properly test relative performance, you need to compare apples to apples.
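
    One way to keep those phases apart is to time each step on its own, so that the metadata refresh is not folded into the install figure. The following is only a rough sketch: time_phase is a hypothetical helper (not part of yum or apt), and it assumes GNU date for nanosecond timestamps.

```shell
# time_phase LABEL CMD [ARGS...]
# Runs CMD, then prints LABEL, the elapsed wall-clock time in ms, and the exit status.
# Hypothetical helper for illustration; assumes GNU date (for the %N format).
time_phase() {
    label=$1
    shift
    start=$(date +%s%N)              # nanoseconds since the epoch
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s%N)
    echo "$label: $(( (end - start) / 1000000 )) ms (exit $status)"
    return "$status"
}

# Harmless stand-in so the sketch runs anywhere.
time_phase "demo" true

# On real systems, as root (left commented out because they change the system):
# time_phase "yum metadata refresh" yum makecache
# time_phase "yum cached install"   yum -y -C install jed
# time_phase "apt metadata refresh" apt-get update
# time_phase "apt install"          apt-get -y install jed
```

    Comparing the two install lines after both refresh lines have run gets closer to apples-to-apples, though it still says nothing about how much dependency information each tool is evaluating.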

    Thanks for taking the time to write.

  4. Also, just for fun — again, neither scientific nor particularly enlightening — I tried this:

    su -
    yum makecache
    cd /var/cache/yum/fedora/packages
    yumdownloader jed

    OK, so that means I have a complete valid cache on disk to use, and I downloaded the jed package to eliminate the network download time from the test. (I’ll ignore hard disk time — this is just a poor man’s attempt at isolating the depsolving.) Now I’ll just do the dependency resolution:

    yum -y -C --disableplugin={fastestmirror,refresh-packagekit} resolvedep jed

    Wow, only about 0.4 seconds. The remainder of the time for a full installation was spent loading some more Python stuff (I think), a database transaction test to make sure that problems like disk space aren’t in the way, and writing the material to disk.
