Upgrading Fedora
General issues in upgrading Fedora
dnf install dnf-utils rpmconf meld
Before Upgrading
dnf update
package-cleanup --problems
package-cleanup --dupes
package-cleanup --orphans
rpmconf -a (using meld to merge the configuration files)
After Upgrading
dnf distro-sync
dnf update
package-cleanup --problems
package-cleanup --dupes
package-cleanup --orphans
rpmconf -a (using meld to merge the configuration files)
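The two checklists above can be wrapped in a small script so the steps always run in the same order. This is only a sketch: the names run, upgrade_checks and DRY_RUN are invented for illustration, and the commands are simply the ones listed above.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# upgrade_checks before|after
# "after" adds the distro-sync step that the post-upgrade list starts with.
upgrade_checks() {
    if [ "${1:-before}" = "after" ]; then
        run dnf distro-sync
    fi
    run dnf update
    run package-cleanup --problems
    run package-cleanup --dupes
    run package-cleanup --orphans
    run rpmconf -a
}
```

Run it as root with upgrade_checks before ahead of the upgrade and upgrade_checks after once the new release has booted; with DRY_RUN=1 it only prints what it would do.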
Some useful scripts
rpm -e --justdb --nodeps packagename
Special Issues, Upgrading / Backing up Multi-Partition Hosts
Use live USB stick to boot so system is quiescent, then use parted to visualise structure
select /dev/sdx
print
then
dd if=/dev/sda9 of=/dev/sdx4 bs=128M
watch -n 10 killall -USR1 dd
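Note that killall -USR1 dd signals every dd process on the machine. A possible alternative, sketched below, signals only the copy it started itself. The function name and its arguments are hypothetical; it relies on GNU dd printing its I/O statistics when it receives SIGUSR1.

```shell
# clone_with_progress SRC DST [INTERVAL_SECONDS] [BLOCK_SIZE]
clone_with_progress() {
    src=$1
    dst=$2
    interval=${3:-10}
    block_size=${4:-128M}

    dd if="$src" of="$dst" bs="$block_size" &
    dd_pid=$!

    # Sleep first so dd has had time to install its signal handler, then
    # ask it for statistics for as long as it is still running.
    while sleep "$interval" && kill -0 "$dd_pid" 2>/dev/null; do
        kill -USR1 "$dd_pid" 2>/dev/null
    done

    # Hand back dd's own exit status.
    wait "$dd_pid"
}
```

For the copy above this would be called as clone_with_progress /dev/sda9 /dev/sdx4.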
Special Issues, Upgrading VirtualBox Hosts
Uninstall virtualbox before the upgrade, upgrade, then reinstall virtualbox. (You could try using the yum repo for virtualbox provided by rpmfusion, but so far I've found it too painful, with too many glitches in the rpms; directly installing the rpms from www.virtualbox.org is much easier.)
Special Issues, Upgrading VirtualBox Guests
Special Issues, Upgrading Cluster
Create new INSTALL and RUN directories, and export via nfs
Download appropriate DVD iso, and copy to INSTALL from command line (don't use File Manager because of name length limitations - symptoms, will fail to find repodata).
Set permissions of INSTALL subdirectories
Copy INSTALL scripts from previous version of fedora
Copy RUN contents from previous version of fedora
Now using yum via kaya, having switched kaya routing on
Do make sure to disable the updates repo in all yum commands; we can't handle the load of updating all cluster machines
Probably should set up a yum proxy on kaya, but I don't have time
Copy previous directory in tftpboot to new version
Copy new TRANS.TBL, vmlinuz, initrd.img from nnn/INSTALL/isolinux
cd to pxelinux.cfg, and edit ANACONDA_TEMPLATE.cfg and makecfgs to update version of fedora and location of install contents
Try a test installation of a basic system (same HD structure) from DVD; save kickstart file from root directory to a USB drive, and compare with previous ANACONDA_TEMPLATE.cfg
Run makecfgs to update version of fedora and location of install contents
Test by installing on cluster machine. If it reboots OK, leave running and proceed to upgrade cluster controller
If updated client appears in xpbsmon after upgrade, upgrade all other cluster clients (else debug)
Special Issues, Upgrading sc and sc1
Use dd to make copies of all system partitions to backup boot disk
dd bs=64k if=/dev/vg_sc1_f15plus/lv_xxx of=/dev/vg_sc1_f14plus/lv_xxx for xxx in var, root, usr, usrlocal
Update /sda1 boot config to reflect this
Test the backup works directly from BIOS
Now trying a new version (for both sc and sc1), documented here in case it helps someone else. Our servers have a slow but reasonably large 2.5" disk as the boot disk, plus a RAID array. The detailed RAID type largely doesn't matter, but for the record: software RAID 0 striped over hardware RAID 1 on sc (done this way because of hardware limitations), and hardware RAID 5 on sc1. The basic idea is to keep a base system on the 2.5" disk (sda for concreteness), and the production system on the RAID array (md for concreteness). This isn't working yet, but I'll describe it anyway.
sda1 contains the typical fedora boot partition. sda2 contains an lvm volume group, sc, which contains logical volumes root and home (with the obvious meanings). The actual structure is more complex, but this will do for now. The RAID array contains another vg, scfast, also with lvs root and home.
In normal running, the system boots from sda1 to scfast/root and scfast/home. These are yum-updated normally. When an upgrade is needed, /boot is dd-ed to a partition on md so we have an emergency backup. The system is then booted from sc/root and sc/home, with all other mounts from scfast commented out in fstab (i.e. the system boots entirely from sc), and scfast physically disconnected to ensure that no screw-ups can occur. sc is upgraded in the normal fashion (if anything goes wrong, we still have a running production system on scfast). When sc is running and stable, we go through this process (partially courtesy of Andy Botting):
Add a snapshot volume for sc/root: lvcreate --size 5G --snapshot --name root_snap /dev/sc/root
(this is necessary so that we get a different UUID, since dd copies everything, including UUIDs, and lvchange still provides no mechanism for changing UUIDs)
dd bs=128M if=/dev/sc/root_snap of=/dev/scfast/root
lvremove /dev/sc/root_snap (we no longer need it)
uuidgen (to create a brand spanking new UUID for the ext4 filesystem on /dev/scfast/root)
tune2fs -U <value generated by uuidgen> /dev/scfast/root
Make sure that /etc/default/grub contains all modules needed for any RAID (e.g. md)
grub2-mkconfig -o /boot/grub2/grub.cfg
This will create updated grub entries for booting from sc
Copy the top entry (the whole entry) from /boot/grub2/grub.cfg to /etc/grub.d/40_custom
Edit as appropriate (i.e. changing sc to scfast as appropriate, and updating UUIDs - blkid is your friend to find these out)
Edit fstab correspondingly
grub2-mkconfig -o /boot/grub2/grub.cfg
This will create a menuentry for booting to scfast
dd /sda1 to a safe backup
Try to boot
If it succeeds
Edit /etc/default/grub as appropriate
grub2-mkconfig -o /boot/grub2/grub.cfg
Check out /boot/grub2/grub.cfg - somehow you need to make scfast the default boot system
Repeat the process, to generate alternate boot entries for booting to sc
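The copy-and-relabel part of this process (snapshot, dd, lvremove, uuidgen, tune2fs) might be scripted roughly as follows. This is a sketch under assumptions: run, clone_root and DRY_RUN are invented names, the lvcreate call spells out --snapshot on the assumption that this is what the snapshot step intends, and the grub and fstab edits are left as manual steps.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# clone_root SRC_VG DST_VG
# Copy SRC_VG/root onto DST_VG/root through a temporary snapshot, then
# give the copied filesystem a fresh UUID.
clone_root() {
    src_vg=$1
    dst_vg=$2

    run lvcreate --size 5G --snapshot --name root_snap "/dev/$src_vg/root" || return
    run dd bs=128M if="/dev/$src_vg/root_snap" of="/dev/$dst_vg/root" || return
    run lvremove "/dev/$src_vg/root_snap" || return

    if [ -n "${DRY_RUN:-}" ]; then
        new_uuid='<new-uuid>'        # placeholder shown in dry-run output only
    else
        new_uuid=$(uuidgen) || return
    fi
    run tune2fs -U "$new_uuid" "/dev/$dst_vg/root"
}
```

With the volume groups described above, DRY_RUN=1 clone_root sc scfast prints the four commands so they can be checked before anything is overwritten. If a step fails, the function stops there and any snapshot already created is left in place.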
Special Issues, Upgrading sc1
Disable unbacked in fstab
Reboot
Upgrade (e.g. by preupgrade)
cd /root/RocketRAID/rr232x-linux-src-v1.10/product/rr232x/linux
make
make install
reboot
re-enable unbacked in fstab
reboot
Special Issues, Upgrading to Fedora 21
Only just started on this, I expect this list to grow…
A number of packages haven't been properly updated for F21, so they don't get upgraded properly. As of Dec. 23, the list includes wget, samba and sqlite. The cause seems to be a misunderstanding between the package maintainers and the fedup maintainers about the required relationship between F20 and F21 version numbers for things to work properly. You can fix this after upgrade by running "yum downgrade" on the affected packages.
The ATI Rage 128 video driver doesn't get upgraded properly; you may be able to fix this by booting into single user mode and issuing yum downgrade xorg-x11-server-Xorg (if you can't, you will be left with a system that can't get to a login screen). However I can't test this because…
There is a further problem with the assignment of network interfaces, so (especially if you need fixed ip addresses), the network may require further configuration. You can either do this manually, or - if you have a video interface (see above) - use the networkmanager gui.
Bottom line: currently there are too many bugs that may interact, creating catch-22s as above. Fortunately, so far I have only been upgrading virtual machines. One is running F21 OK; I think I'm going to have to revert the other. I won't be trying to upgrade real machines till things are a bit more stable.
Special Issues, Upgrading to Fedora 20
The biggy here seems to be that it's best to disable all external repos (and specifically, rpmfusion) before running fedup
yum repolist all | less
yum-config-manager --disable 'rpmfusion*'
Run fedup
yum-config-manager --enable 'rpmfusion*20*'
yum repolist all | less
(just to check)
sudo sed -i 's# -a -d /system-upgrade##' /lib/systemd/system-generators/system-upgrade-generator
The network interface naming protocol has changed _yet again_, I needed to reconstruct the firewall to match
On one system running a dhcp server, dhcp startup was failing (probably a timing race), and I needed to put into rc.local
systemctl restart dhcpd
On the same system I had a similar problem with a tftp server, tftp startup was failing (probably also a timing race), and I needed to put into rc.local
systemctl restart tftp.socket
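Instead of restarting both services unconditionally, rc.local could restart only the ones that did not come up. A minimal sketch, assuming the unit names used above; restart_if_inactive is a hypothetical helper name.

```shell
# restart_if_inactive UNIT...
# Restart each unit that systemd does not currently report as active.
restart_if_inactive() {
    for unit in "$@"; do
        if ! systemctl is-active --quiet "$unit"; then
            systemctl restart "$unit"
        fi
    done
}

# In rc.local this would be called as:
#   restart_if_inactive dhcpd.service tftp.socket
```

This does not remove the timing race itself; it only avoids bouncing a service that started correctly.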
Special Issues, Upgrading to Fedora 19
F19 VirtualBox guests: fstab mounting of virtual box shared folders (type vboxsf) seems to have stopped working (this may be just an initial problem), so they need to be mounted manually, say from rc.local. However the availability of share mounting seems to take some time. I've had to put a 'sleep 30' in the initscript. And unfortunately…
rc.local seems to have been definitively removed from the default list. To enable it again:
create rc.local in /etc/rc.d:
-rwxr-xr-x. root root system_u:object_r:initrc_exec_t:s0 rc.local
systemctl enable rc-local.service
systemctl start rc-local.service (to test)
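Creating the file could look something like the sketch below. make_rc_local is a made-up name, and the file contents are a bare placeholder; the point is only that rc.local needs a shebang line and execute permission.

```shell
# make_rc_local [PATH]
# Create a minimal executable rc.local unless one already exists.
make_rc_local() {
    target=${1:-/etc/rc.d/rc.local}
    if [ -e "$target" ]; then
        echo "$target already exists, leaving it alone" >&2
        return 0
    fi
    printf '%s\n' '#!/bin/sh' '# Local startup commands go below.' 'exit 0' > "$target" &&
        chmod 755 "$target"
}

# On the real system, as root, follow with:
#   restorecon -v /etc/rc.d/rc.local   # reset the SELinux label from policy
#   ls -Z /etc/rc.d/rc.local           # compare with the listing above
#   systemctl enable rc-local.service
```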
The old system-config-network gui seems to have definitively stopped working (I found some guides to getting it working again around the net; unfortunately I wasn't able to get their recipes working for me). So it looks like it's command-line for the future
Unless you want to switch to NetworkManager, which seems to be fine for desktops, but still has too many bugs for server use in my experience…
Fedora 19 has dropped support for the old-style procfs based drivers - which is what the older HighPoint RocketRaid drivers are built on. This was a big problem for us, as we didn't have a larger PCIE slot available (so we couldn't use any of the more modern RAID controllers), yet rewriting the driver to use sysfs was a very daunting prospect. So all kudos to HighPoint. They came to the rescue by rewriting the driver to use sysfs, and supplied me with rr232x-linux-src-v1.10.1, despite having told us for at least three years that they were not providing support for these drivers any longer. _Thank you_. I'd love to be able to provide copies here, but unlike their newer drivers, it doesn't seem to have been GPL'ed, so I can't. But I'm sure if you write to them nicely they'll be helpful. Cross fingers, we won't face any more driver changes in Fedora for the 18 months or so that these machines need to last us…
Special Issues, Upgrading to Fedora 18
Fortunately, F18 doesn't seem to have many special issues beyond…
If you are running an iptables firewall (for example, one configured by fwbuilder), you need to be aware that upgrading to F18 has installed firewalld as an alternative, and that it becomes the default in F19. If you want to continue to use your old firewall rules, somewhere between F18 and F19 you need to do
systemctl disable firewalld.service
systemctl stop firewalld.service
Unfortunately you can't easily remove it completely because NetworkManager has a yum dependency on it
systemctl status firewalld.service
systemctl status iptables.service
just to make sure
If you are using specific eth<n> style names for interfaces using udev rules and config files, this will probably stop working somewhere between F18 and F19 because of the repeated stuff-ups with this code. Fedora 18 interface naming lasted one whole release; the new naming convention in F19 is just as bad (just as susceptible to slot reconfiguration). If you want to stick to the earlier (and more reliable) approach based on MAC addresses, you can still do it via udev rules, but you need to change the naming base from eth to something else (because of conflicts with the kernel interface naming). To do this:
Rename the interfaces in /etc/udev/rules.d/70-persistent-net.rules to some other base name (I renamed eth<n> to sceth<n>; this simple change meant that the names got attached to the interfaces specified in the rules, whereas the original eth<n> names got all confused).
Rename the corresponding files in /etc/sysconfig/network-scripts, /etc/sysconfig/networking/devices and /etc/sysconfig/networking/profile/default appropriately. I found this useful: for i in *eth[0-9] ; do j=`echo $i | sed -e 's;eth;sceth;'`; mv $i $j; done
Make corresponding changes in all firewalls
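The rename loop above turns an already renamed file into scsceth<n> if it is run a second time in the same directory. A slightly more defensive variant is sketched here; rename_eth_files is a hypothetical name, and the default new base sceth is the one used above.

```shell
# rename_eth_files DIR [NEW_BASE]
# Rename files such as ifcfg-eth0 to ifcfg-sceth0, skipping files that
# already carry the new base name so the function can be re-run safely.
rename_eth_files() {
    dir=$1
    new_base=${2:-sceth}

    for path in "$dir"/*eth[0-9]; do
        [ -e "$path" ] || continue            # glob matched nothing
        name=${path##*/}
        case $name in
            *"$new_base"[0-9]) continue ;;    # already renamed
        esac
        new_name=$(printf '%s\n' "$name" | sed -e "s;eth;$new_base;")
        mv -- "$path" "$dir/$new_name"
    done
}
```

It would be run once per directory, for example rename_eth_files /etc/sysconfig/network-scripts. The contents of the files (DEVICE= lines and the like) still need editing by hand.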
Special Issues, Upgrading to Fedora 17
Please see F16 issues below (many issues are repeated for F17). Especially, for RocketRaid, note that the kernel changes to 3.4 soon after the upgrade, so you will need to change the sources as below to reflect this.
Bugs: there seem to be a lot.
The real biggie is bug 820351. It means that upgrading from a DVD doesn't work, and will leave your yum configuration in a seriously screwed-up state unless you enable the network, and the (F17) updates repo, during the upgrade. My recommendation: if you don't know yum reasonably well, don't upgrade from a DVD.
If you do get caught with this, do a yum remove of the package causing the problem (this may change as the packages get updated; at the moment, the problems are turning up with one of the lib-sane packages and with one of the cups libraries; both can safely be removed. If you see a whole slew of dependent packages being removed, better not to do it…). Then do yum --skip-broken distro-sync. Finally, run package-cleanup --orphans and manually yum remove all the orphaned packages (check carefully first, some of the packages might be ones you manually installed).
If you have a separate /usr partition, you're in for more excitement - exactly which depends on your upgrade method.
If you use preupgrade, you'll find that the boot fails (because essential bits are in /usr). To fix this, you need to add rd.lvm.lv=vg_sc4/lv_usr to the boot command line on the first boot. Then you need to go into /etc/default/grub and edit GRUB_CMDLINE_LINUX to something like:
GRUB_CMDLINE_LINUX="rd.lvm.lv=vg_sc4/lv_root rd.lvm.lv=vg_sc4/lv_usr rd.lvm.lv=vg_sc4/lv_swap"
and finally run grub2-mkconfig -o /boot/grub2/grub.cfg, after which your system will probably run somewhat normally.
If you use an install DVD for the upgrade, somehow /usr gets set to read-only during the update. This shows up as the system saying that there isn't enough space to install something. You can manually do mount -o remount,rw /usr, and then run yum normally.
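Since the read-only /usr shows up as a misleading "not enough space" error, it can help to check the mount options directly. The sketch below reads /proc/mounts; is_mounted_readonly is a hypothetical name, and the optional second argument exists only so the function can be tried against a sample file.

```shell
# is_mounted_readonly MOUNTPOINT [MOUNTS_FILE]
# Succeeds (exit status 0) if MOUNTPOINT is listed with the "ro" option.
is_mounted_readonly() {
    mountpoint=$1
    mounts=${2:-/proc/mounts}

    awk -v mp="$mountpoint" '
        $2 == mp {
            found = 1
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") ro = 1
        }
        END { exit !(found && ro) }
    ' "$mounts"
}
```

Usage would be along the lines of: is_mounted_readonly /usr && mount -o remount,rw /usr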
Some binaries seem to have disappeared from the installation repo; I needed to omit:
libsigc++
hdparm
For the initial pxe/tftp boot config (which pivots to boot from nfs), I needed to change the kernel append command:
Original:
append initrd=initrd.img ks=nfs:192.168.<NETNUM>.1:/tangof17/INSTALL/C0A80<NETNUM>.cfg ramdisk_size=500000 devfs=nomount text dns=192.168.<NETNUM>.1 ip=dhcp ksdevice=eth0
New:
append initrd=initrd.img ks=nfs:192.168.<NETNUM>.1:/tangof17/INSTALL/C0A80<NETNUM>.cfg ramdisk_size=500000 devfs=nomount text dns=192.168.<NETNUM>.1 ksdevice=bootif
ipappend 2
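The C0A80<NETNUM> part of the kickstart file name looks like the uppercase hexadecimal form of the client's IP address (C0A8 is 192.168), which is also how pxelinux names per-client config files. Assuming that is the convention in use here, the name for a given address can be worked out as below; ip_to_pxe_hex is a hypothetical helper.

```shell
# ip_to_pxe_hex A.B.C.D
# Print the dotted-quad IPv4 address as 8 uppercase hex digits.
# Octets must not have leading zeros (printf would read them as octal).
ip_to_pxe_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
}
```

For example, ip_to_pxe_hex 192.168.1.10 prints C0A8010A.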
For pxe/nfs installs, there is a problem on the reboot after the install: networkmanager may be stopped before nfs is unmounted (probably depending on the relative speeds of different bits of hardware), resulting in the system hanging. To fix this, I needed to put this img file (see the Fedora 17 problems page) into <Expanded ISO directory>/images.
system-config-firewall has a serious bug: in a newly installed Fedora 17, the initial state of the firewall doesn't match what system-config-firewall shows. You have to first save it, by for example setting ssh to disabled, saving, setting back to enabled, and saving again…
A change from previous versions: sshd isn't enabled at installation. You have to run
systemctl enable sshd.service
after installation
Also some positives:
Realtek rt2500 wireless card support seems to be working again (for the first time in about a year)
Special Issues, Upgrading to Fedora 16
Fedora 16 seems to have had the buggiest upgrade process yet (though it looks quite a nice system _once_ you get it going). This is my list of needed fixes:
If you are using a 64-bit system, you will need to boot first into runlevel 3 (add init 3 to your boot parameters) and use yum to uninstall the 32-bit version of caribou that got installed in the upgrade, and install the 64-bit version instead
If you are using openvpn
Client:
Create your openvpn.conf, e.g. /etc/openvpn/scopenvpn.conf
ln -s /lib/systemd/system/openvpn@.service /etc/systemd/system/multi-user.target.wants/openvpn@scopenvpn.service
You should be able to do the above with systemctl enable openvpn@scopenvpn.service, but for some reason it fails
systemctl start openvpn@scopenvpn.service to check it works
Host:
Exactly as above, except that the enabled service doesn't seem to start at boot (I think openvpn is being started too early, so it fails)
In /etc/rc.d/rc.local, put systemctl restart openvpn@scopenvpn.service
If you are hosting nfs
The nfs server probably won't start: systemctl start nfs-server.service followed by systemctl enable nfs-server.service
If you have ever edited /etc/sysconfig/nfs, it will no longer work. You need to back up /etc/sysconfig/nfs to /etc/sysconfig/nfs.bak, move /etc/sysconfig/nfs.rpmnew to /etc/sysconfig/nfs, then edit /etc/sysconfig/nfs to carry over whatever changes you had in /etc/sysconfig/nfs.bak, and finally systemctl restart nfs-server.service. To see whether you have this problem, check rpcinfo -p <nfs-server-name>; if you do, you won't see the nfs, nfs_acl or nlockmgr daemons.
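The rpcinfo check can be automated with a small filter. This is a sketch that assumes the usual five-column rpcinfo -p layout (program, vers, proto, port, service) and that the lock manager is listed under the service name nlockmgr; missing_nfs_services is a made-up name.

```shell
# missing_nfs_services
# Reads "rpcinfo -p" output on stdin and prints each expected NFS
# service name that does not appear in the service column.
missing_nfs_services() {
    awk '
        { seen[$5] = 1 }
        END {
            n = split("nfs nfs_acl nlockmgr", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }
    '
}
```

It would be used as rpcinfo -p <nfs-server-name> | missing_nfs_services; no output means all three services are registered.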
The level of duplicate rpms from yum seems to be much higher than usual. You probably should try to remove them
In this upgrade, rhgb and quiet (and probably other kernel options) get re-enabled in grub. To fix this, edit /etc/default/grub, then grub2-mkconfig -o /boot/grub2/grub.cfg. However this will probably mess up the grub defaults. If you want it to behave in the natural way, and reboot into the previously-selected kernel, you need to add GRUB_SAVEDEFAULT=true as well to /etc/default/grub, before doing grub2-mkconfig. If you are running F16 in a virtualbox guest, you probably need to add divider=10 as well.
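The rhgb and quiet removal can be done with a filter rather than by hand. The sketch below only prints the edited file, so the result can be inspected before replacing /etc/default/grub; strip_rhgb_quiet is a hypothetical name, and it assumes GRUB_CMDLINE_LINUX is a single double-quoted line.

```shell
# strip_rhgb_quiet FILE
# Print FILE with the words "rhgb" and "quiet" removed from the
# GRUB_CMDLINE_LINUX="..." line; all other lines pass through unchanged.
strip_rhgb_quiet() {
    awk '
        /^GRUB_CMDLINE_LINUX="/ {
            value = $0
            sub(/^GRUB_CMDLINE_LINUX="/, "", value)
            sub(/"[ \t]*$/, "", value)
            n = split(value, words, " ")
            out = ""
            for (i = 1; i <= n; i++)
                if (words[i] != "rhgb" && words[i] != "quiet")
                    out = out (out == "" ? "" : " ") words[i]
            print "GRUB_CMDLINE_LINUX=\"" out "\""
            next
        }
        { print }
    ' "$1"
}
```

A cautious way to use it: strip_rhgb_quiet /etc/default/grub > /tmp/grub.new, compare the two files, copy the new one into place, then run grub2-mkconfig -o /boot/grub2/grub.cfg.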
If you are using nx, the keyfile /var/lib/nxserver/home/.ssh/authorized_keys2 gets moved to authorized_keys2.disabled and needs to be moved back for nx to work
If you are using an older Highpoint RocketRaid adapter, Highpoint haven't updated the drivers in quite a while. You will need to apply previous mods for 2.6.30 kernels (thanks to Niels Horn), and then add further modifications to tell the scripts to compile for a 3.1 kernel. Please note that this is very risky; you use this based on your own expertise. It is quite possible that even for my specific hardware, there could be problems down the track - and I have no idea about yours. If you aren't familiar at minimum with C, shell scripting and linux structure, please don't try; the risk is far greater than any possible benefit. Anyway, in my case, the changes were reasonably extensive, so please go through by hand and compare.
Here is what I think was the original version I started from.
Here is my changed (and apparently working) version. To the best of my recall, the files I changed were:
in /root/RocketRaid3.1/rr232x-linux-src-v1.10/inc/linux Makefile.def
in /root/RocketRaid3.1/rr232x-linux-src-v1.10/osm/linux patch.sh osm_linux.c install.sh
The recent upgrade to kernel 3.6 causes further problems. All kudos to ZoZo on the ubuntu forums, who discovered that:
I succeeded in compiling the drivers, but they're the rr2340, not rr62x. What I did was look for calls to kmap_atomic and kunmap_atomic in the source code (under os_linux.c and osm_linux.c in my case), and removed their second argument (HPT_something). Then I deleted the #define lines referring to KM_BIO_SRC_IRQ (under osm_linux.h in my case), they weren't needed anymore. Then 'make install' and voilà. You can try the same technique on your side.
Worked for me too…
…and 3.7 brings even more joy. For some unknown reason, the kernel maintainers have decided to move things around, so I found it necessary to make the following changes around line 80 of RocketRAID3.7/rr232x-linux-src-v1.10/inc/linux/Makefile.def:
#
# change KERNELDIR according to your system
# Kernel 3.7 moved all the directories around... https://lkml.org/lkml/2012/7/20/419
#
ifndef KERNELDIR
KERNELDIR := /lib/modules/$(shell uname -r)/build
endif
KERNELSRC := /usr/src/kernels/$(shell uname -r)
#KERNEL_VER := 2.$(shell expr `grep LINUX_VERSION_CODE $(KERNELDIR)/include/linux/version.h | cut -d' ' -f3` / 256 % 256)
#KERNEL_VER := 3.$(shell expr `grep LINUX_VERSION_CODE $(KERNELDIR)/include/linux/version.h | cut -d' ' -f3` / 256 % 256)
KERNEL_VER := 3.$(shell expr `grep LINUX_VERSION_CODE $(KERNELSRC)/include/generated/uapi/linux/version.h | cut -d' ' -f3` / 256 % 256)
ifeq ($(KERNEL_VER),)
#$(error Cannot find kernel version. Check $(KERNELDIR)/include/linux/version.h.)
$(error Cannot find kernel version. Check $(KERNELSRC)/include/generated/uapi/linux/version.h.)
endif
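For anyone puzzled by the / 256 % 256 arithmetic: LINUX_VERSION_CODE packs the kernel version into one integer as major * 65536 + minor * 256 + patch, so the expression extracts the minor number, and the major number ("3.") is simply hard-coded in front. The decoding can be checked by hand with a small function (decode_version_code is a made-up name):

```shell
# decode_version_code CODE
# Unpack a LINUX_VERSION_CODE integer into major.minor.patch.
decode_version_code() {
    code=$1
    echo "$(( code / 65536 )).$(( code / 256 % 256 )).$(( code % 256 ))"
}
```

For example, decode_version_code 198400 prints 3.7.0, since 3 * 65536 + 7 * 256 = 198400. The hard-coded major number is why this Makefile needed touching again for each new kernel series.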
If you are using a RAlink wireless card, check whether you are using the RTxx00 driver. If so, it has been broken since kernel 2.6.40, giving system crashes (RT2500 and perhaps other cards) and very slow, unreliable connections (RT2800 and probably other cards). While in F15, you can regress to the 2.6.38 kernel, which seems to be fine. If you upgrade to F16, this option is removed. I would strongly recommend not upgrading till the kernel/driver issues are fixed (see bug 731672 and bug 753648).
Replacing GNOME with LXDE
I'm sure Gnome 3 has its good points. So far, I haven't had a chance to find them, because it fails to work properly on most of our hardware, so that systems become unusable. I've found it necessary to switch to lxde instead. My guess is that Gnome 3 is fine if you happen to have a gamer-style machine with a high-end graphics card. If your machine is a scientific machine, optimised for computation, it's just luck whether you have a graphics card Gnome 3 supports properly. Of course, if you are installing fedora from scratch I would strongly recommend using the lxde spin. If you have a Gnome system that you need to convert to lxde, here are the steps I used:
Install the right software:
yum install imsettings-lxde lxde-common lxde-icon-theme lxdm lxmenu-data lxpanel lxsession lxappearance lxinput lxlauncher lxmusic lxpolkit lxrandr lxsession-edit lxshortcut lxsplit lxtask lxterminal
Actually, I don't think this is everything, but it's enough to work, and I haven't been able to find the missing modules (feedback on this would be greatly appreciated).
So that the system uses the lxde login manager (important - the gnome login manager often times out on scientific systems), and that users get lxde desktop sessions by default, modify (or create) /etc/sysconfig/desktop to contain:
PREFERRED="startlxde"
DISPLAYMANAGER="/sbin/lxdm"
Just to be sure, chcon -u system_u /etc/sysconfig/desktop