How to install Oracle 9iR2 RAC on SuSE Linux Enterprise Server 8
================================================================
--- Using the PolyServe Matrix Server MxS 1.2 ---
(with cluster filesystem)

We tested and wrote this documentation at PolyServe HQ (Portland, OR).
The ORACLE_HOME was put on the cluster filesystem in addition to the data
files!

Oracle requires some directories to be per-node and not shared. These are
the Oracle Cluster Manager directory (at least the configuration file and
the log directory) and all other log directories. The PolyServe cluster
filesystem supports node-dependent files and directories via Context
Dependent Symbolic Links (CDSL), which are symbolic links that resolve to a
different file on each node. The advantage of keeping all files on the
cluster filesystem, including the non-shared ones, is that they can be
accessed and edited on any node in the cluster.

The test hardware used was a 12 node cluster.

Additional documentation for PolyServe MxS 1.2:
* Whitepaper: "Create an Oracle Home directory using Context Dependent
  symbolic links", http://www.polyserve.com/.....
* PolyServe Matrix Server Administration Guide

Requires at least these version numbers:
* United Linux 1.0 (SuSE Linux Enterprise Server 8 is "Powered by UL 1.0")
* UL kernel update for Oracle, at least these version numbers:
    k_smp-2.4.19-196.i586.rpm     - SMP kernel, for almost all Oracle users
    k_deflt-2.4.19-207.i586.rpm   - single CPU
    k_athlon-2.4.19-200.i586.rpm  - optimized for AMD Athlon
    k_debug-2.4.19-164.i586.rpm   - debug kernel
    k_psmp-2.4.19-201.i586.rpm    - support for *very* old Pentium CPUs
    kernel-source-2.4.19.SuSE-152.i586.rpm
  As of United Linux Service Pack #2 the various binary kernel packages all
  have the exact same version numbers. Before SP2 the last number was
  different for each rpm, although they were all built from the same kernel
  sources.
* orarun.rpm: version 1.8 or greater

Tip: There is an easy way to work on all nodes simultaneously! Simply open
     a KDE "Konsole" and in it open one terminal for each node, then log in
     to a different node in each of the terminals. After that, under "View"
     in the Konsole menu, enable "Send Input to All Sessions" for one of
     the terminals. Now, whatever you type in that session is also sent to
     the other sessions, so you work on all nodes simultaneously! This
     greatly reduces the amount of typing you have to do!
     If you do that, remember a few things: the node names, IPs etc. will
     be different on each node. The shell history may be different on each
     node. "vi" remembers where in a file you left off - so if you edit a
     file on all nodes simultaneously, first check that the cursor is in
     the same position on all terminals. And so on - check what's going on
     on the other terminals often (SHIFT + left/right arrow makes this a
     very quick and painless exercise)!!!

Tip: Use "sux" instead of "su" and the X server permissions and the DISPLAY
     variable are set automatically!

Tip: If you work in a noisy server room: get a Bose noise canceling
     headset, as sold for frequent flyers. We found it very valuable in
     server rooms, too!

ALL NODES
---------

- Install SLES-8 and the latest Service Pack or updates from the SuSE
  Maintenance Web.
  Tip: You may have to add "acpi=off" to the boot options if the system
       hangs during boot! Selecting "Safe Settings" as boot option includes
       that.

- Tip: From here on you should use the KDE Konsole feature mentioned
       earlier...
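- Before going on, it is worth double-checking on every node that the
  installed kernel matches the minimum versions listed at the top of this
  document (the orarun version can be checked the same way once it is
  installed in the next step). A minimal check, assuming you run the SMP
  kernel - query k_deflt, k_athlon etc. instead if that is what you boot:
    # show the installed kernel packages and their versions
    rpm -q k_smp kernel-source
    # the running kernel should match the installed one
    uname -r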
- Get and install the latest version of the orarun package:
    wget ftp://ftp.suse.com/pub/suse/i386/supplementary/commercial/Oracle/sles-8/orarun.rpm
    rpm -Uvh orarun.rpm
  Make sure to satisfy all dependencies!

- Set the password for user oracle: as root do "passwd oracle"

- Optional: create an extra /home/oracle:
    cp -a /etc/skel /home/oracle
    chown oracle:oinstall /home/oracle
    usermod -d /home/oracle oracle
  Alternatively, you may want to change just the home directory to wherever
  you plan to have your Oracle home directory.

- Remove gcc 3.2 ("rpm -e gcc --nodeps") to be sure it's not used - we
  prefer an error message during installation over inadvertently using
  gcc 3.2.
  (If you choose not to remove it you have to edit
  $ORACLE_HOME/bin/genclntsh as well as $ORACLE_HOME/bin/genagtsh and add
  "/opt/gcc295/bin" *in front* of the PATH variable set in those scripts!
  Then do "relink all" after the Oracle installation has finished.)

- Just in case... some Oracle makefiles don't look for "gcc" but for
  "/usr/bin/gcc". We found network/lib/env_*.mk as an example during the
  9.2.0.3 patchset installation. As "root" do this:
    cd /usr/bin
    ln -s /opt/gcc295/bin/gcc gcc

- In file /etc/sysconfig/suseconfig set
    CHECK_ETC_HOSTS="no"
    BEAUTIFY_ETC_HOSTS="no"

- Set up /etc/hosts

- Set up the network interfaces (internal/external).

- Enable rsh and rcp (without password; except for root)
  * Set up /etc/hosts.equiv (for rsh): add all node names (as defined in
    /etc/hosts - for security and reliability reasons we prefer not to rely
    on external DNS lookups for this!)
  * Edit /etc/inetd.conf (or use the yast2 module for inetd) and remove the
    "#" in front of "shell..." and "login...". One service is needed for
    "rsh", the other one for "rcp".
    (There are two lines for each; the one with the additional "-a" option
    does hostname verification via reverse lookup before accepting a
    connection. Since we don't rely on DNS - see above - we don't need
    this.)
  * As root, do "chkconfig inetd on" and "rcinetd start" (for immediate
    start)
  * Check if you can "rsh" and "rcp" - as user oracle - from any node to
    any other node in the cluster. We just enabled this feature for all
    users except for root! (A small loop for this check is sketched at the
    end of this section.)

- Run "rcoracle start"
  This sets the kernel parameters *before* we even start the installation!

- Optional: install and configure xntpd to synchronize the time/date on all
  nodes. This keeps the system clocks on all nodes synchronized to within a
  tenth of a second.

- Edit /etc/profile.d/oracle.sh and /etc/profile.d/oracle.csh and set
  ORACLE_SID to some SID plus the node number (e.g. ORCL1 on node 1, ORCL2
  on node 2, ...) on each node. Make sure to edit BOTH files, because there
  are Oracle scripts used during installation that use a C shell, a Korn
  shell or a Bourne shell.

- Install and configure PolyServe Matrix Server. Mount the cluster
  filesystems for the shared data and the shared Oracle home. In this
  document we use /opt/oracle for the shared Oracle home (ORACLE_BASE) and
  /var/opt/oracle for the database files. Change the owner of these
  directories to user "oracle", group "dba".

  Comment: Matrix Server includes a cluster filesystem and its own cluster
  management tools. For the purposes of the installation all we use in the
  following steps is the cluster filesystem. What the PolyServe product
  offers in addition to some unique features inside the cluster filesystem
  is a real cluster management component with some very powerful features.
  ...

  Note: The cluster filesystem for the Oracle cluster manager quorum file
  MUST be on a PolyServe filesystem that is mounted using the DB_OPTIMIZED
  option!
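- A small sketch of the rsh/rcp connectivity check mentioned above. The
  node names node01 ... node04 are only examples - use the names you put
  into /etc/hosts and /etc/hosts.equiv, and run the loop as user "oracle"
  on every node:
    # every iteration should print the remote hostname
    # without asking for a password
    for n in node01 node02 node03 node04; do
        rsh $n hostname
    done
    # rcp can be verified the same way, e.g. by copying a small test file
    rcp /etc/hosts node02:/tmp/rcp-test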
NODE #1 (installation node)
---------------------------

- Info: Unlike most other RAC installations, where the Oracle home
  directory is not shared, the entire installation can in our case be done
  completely on just one - on any! - of the nodes in the cluster!

- As user "oracle" run ./runInstaller to install the Cluster Manager. For
  the quorum file enter something like this: /var/opt/oracle/quorum (or
  wherever you mounted the cluster filesystem).
  Caution: For this shared Oracle home installation you must enter only ONE
  node in the Oracle Cluster Manager installation window! We install on
  only ONE node this time, and because of the cluster filesystem all other
  nodes share this installation!
  Info: A big advantage of this installation type is that we avoid all the
  complications of the Oracle installer copying parts of the installed
  Oracle home to the other nodes, which in 9iR2 still has some bugs and
  sometimes even hangs at one step or another (randomly).
  Exit the installer when you're done. If you select "Next install" to
  install the patchset (next point) right from there, the installer will
  crash (known Oracle installer bug).

- As oracle: ./runInstaller - change the source to where you saved the
  9.2.0.2 (or later) patchset and install the 920x patch for the "Cluster
  Manager".
  Info: The installation of the patchset is HIGHLY recommended since,
  beginning with 9.2.0.2, Oracle no longer uses "watchdogd" but a kernel
  module (written by Oracle and included in the SuSE kernels) called
  hangcheck-timer, which has many big advantages over the old "watchdogd"!
  So let's not even get started with the original cluster manager.

- Edit /etc/sysconfig/oracle to enable the start of OCM and GSD (GSD will
  work only later, after the full software is installed):
    START_ORACLE_DB_OCM="yes"
    START_ORACLE_DB_GSD="yes"

- Create the quorum file:
    dd if=/dev/zero of=/var/opt/oracle/quorum bs=1024k count=4

- Start OCM and the hangcheck-timer (called "iofence-timer" by oracm) - on
  the installation node only right now:
    rcoracle start
  If you didn't install our Oracle update kernel you will get an error
  about a missing module "iofence-timer"!

- On each node: check the processes and $ORACLE_HOME/oracm/log/cm.log to
  see whether oracm is up. Check /var/log/messages and cm.log if there are
  problems. The end of cm.log should look like this (here: 4 nodes):
    ....
    HandleUpdate(): SYNC(2) from node(0) completed {Thu Feb 13 18:20:19 2003 }
    HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER {Thu Feb 13 18:20:19 2003 }
    NMEVENT_RECONFIG [00][00][00][00][00][00][00][0f] {Thu Feb 13 18:20:20 2003 }
    Successful reconfiguration, 1 active node(s)
    node 0 is the master, my node num is 0 (reconfig 1)

NODE #1 (installation node)
---------------------------

- As oracle:
    export SRVM_SHARED_CONFIG=/var/opt/oracle/SharedConfig

- As oracle: create the shared configuration file:
    dd if=/dev/zero of=/var/opt/oracle/SharedConfig bs=1024k count=50
  (On large clusters it needs to be much larger than the recommended value
  of 20 MB - the above command creates a 50 MB file.)

- As oracle: ./runInstaller
  The installer will detect the running Oracle Cluster Manager and, through
  it, all nodes that are part of the cluster, and show them to you. In this
  case, with a shared Oracle home, there should be only one node up if you
  followed the previous steps.
  Select "software only", i.e. no database creation (we want to upgrade to
  the latest Oracle 9.2.0.x patchset first).
  Exit the installer.
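- Because the Oracle home lives on the PolyServe cluster filesystem, the
  software installed in the previous step should already be visible on
  every node - no copying of the Oracle home to the other nodes is needed.
  A minimal check, assuming the orarun profile scripts have set ORACLE_HOME
  in your shell:
    # run as user oracle on any node other than the installation node
    ls -l $ORACLE_HOME/bin/sqlplus $ORACLE_HOME/bin/oracle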
- As oracle: in $ORACLE_BASE/oui/bin/linux/, do:
    ln -s libclntsh.so.9.0 libclntsh.so

- As oracle: runInstaller
  As source select the 920x patchset directory (./stage/products.jar) and
  install the 920x patchset (we already patched the Cluster Manager
  earlier).

- Copy the files /etc/oratab and /etc/oraInst.loc from the installation
  node to the same location on all other nodes, making sure the owner
  remains the same: "oracle:dba" for oratab, "root" for oraInst.loc!

- Stop everything currently running:
    rcoracle stop

- As oracle: create per-node cluster manager directories, because the
  configuration and the logfiles for the cluster manager are per-node. We
  still keep them on the cluster filesystem: that way they stay on the
  shared storage, independent of the nodes, which makes the nodes easier to
  exchange and the configuration easier to administer.
  First check the permissions of $ORACLE_HOME/oracm/; if it is world
  writable do a "chmod -R o-w $ORACLE_HOME/oracm" (bug in the Oracle
  installer).
  We use PolyServe's Context Dependent Symbolic Links (CDSL) feature to
  accomplish this. Those links are interpreted differently on each node, so
  each node will see a different - its own - directory. For our 12 node
  cluster, with node names conveniently consisting of "node##", the
  commands are:
    for n in 1 2 3 4 5 6 7 8 9 10 11 12; do
        cp -a oracm .oracm.node$n
    done
    rm -r oracm
    ln -s .oracm.{HOSTNAME} oracm
  Note: Make sure to use the correct $HOSTNAMEs, i.e. check this variable
  on all nodes! The directories must have the correct name
  ".oracm.$HOSTNAME". We use "dot" files (.oracm.*) but this is a matter of
  personal preference.

ALL NODES
---------

- Edit $ORACLE_HOME/oracm/admin/cmcfg.ora and add the other nodes to
  "PrivateNodeNames=..." and "PublicNodeNames=...". Set "HostName=..." to
  the individual hostname on each node! We use the same path on each node,
  but because "oracm" is a PolyServe CDSL we end up in a per-host
  directory, just as we wanted.

- Restart the Oracle services so that this time GSD - which we have only
  just installed - is started, too. GSD is needed by OEM and by "dbca".
    As oracle: "srvconfig -init"
    As root:   "rcoracle start"

- Go to $ORACLE_BASE and create a link to the shared cluster filesystem
  directory:
    cd $ORACLE_BASE
    ln -s /var/opt/oracle oradata
  (assuming you mounted the directory under /var/opt/oracle)

- Go to $ORACLE_HOME and create a link to the shared cluster filesystem
  directory:
    cd $ORACLE_HOME
    rm -rf dbs
    ln -s /var/opt/oracle dbs
  (assuming you mounted the directory under /var/opt/oracle)

FINISHED
--------
Now that the software is installed and ready, and the cluster manager and
the GSD are up and running, we are ready to create a database!

NODE #1 (installation node)
---------------------------

- (As of orarun-1.8-8 this is done by "rcoracle start" when oracm is
  started.)
  To make "dbca" happy on RAC, as root do:
    touch /etc/rac_on
  A dirty Oracle hack; see the last 5 lines of $ORACLE_HOME/bin/dbca for
  what it does if you are interested. Or: edit "dbca" and you don't need
  this "touch" command.

- (As of orarun-1.8-8 this is done by /etc/profile.d/oracle.sh - for user
  oracle only.)
  To run gsdctl or gsd or srvconfig you have to do this in the same shell:
    unset JAVA_BINDIR JAVA_HOME
  Should "gsdctl" be unable to start GSD, try $ORACLE_HOME/bin/gsd.sh. It
  is almost identical to $ORACLE_HOME/bin/gsdctl, which is also a shell
  script...
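- A short sketch of the manual GSD start sequence described in the previous
  point, in case "rcoracle start" did not already bring GSD up (run as user
  oracle on node #1):
    # gsdctl needs the JAVA variables unset in the *same* shell
    unset JAVA_BINDIR JAVA_HOME
    gsdctl start     # if this fails, try $ORACLE_HOME/bin/gsd.sh
    gsdctl stat      # should report that the GSD is running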
- As oracle: run "netca" to create an Oracle network configuration. The
  Network Configuration Assistant should detect the running cluster manager
  and offer a cluster configuration option! You must at least configure a
  listener. You can accept all the defaults, i.e. simply press NEXT until
  the listener configuration is done.
  If netca does not work (bug: it does not write a listener configuration
  in "cluster" mode; see Metalink for whether there is a patch...) you have
  to use netmgr and/or edit the configuration files manually. Please select
  only ONE node when running these tools (only netca has a cluster option)
  since we share the ORACLE_HOME among all nodes.

- Run "lsnrctl start" on ALL NODES.

- We highly recommend creating the database manually. "dbca" has various
  issues and limitations, especially when trying to create a cluster
  database, and even more so when trying to do this on a cluster filesystem
  - "dbca" expects raw devices and is not able to detect a cluster
  filesystem.
  Should you want to try anyway, as oracle run:
    dbca -datafileDestination $ORACLE_HOME/dbs
  and set up a database. Without the -datafileDestination parameter dbca
  assumes (and checks for!) raw devices, which we don't use here!
  If there's an error right at the start, try restarting the cluster
  manager and GSD via "rcoracle stop; rcoracle start".

- Edit /etc/sysconfig/oracle to start additional services, e.g. the Oracle
  listener. If you set START_ORACLE_DB="yes" you have to edit /etc/oratab
  (on ALL NODES) and change the last letter in the line for your database
  (usually just one line, at the bottom) to "Y", or no database will be
  started. (A quick way to verify that all instances came up is sketched at
  the very end of this document.)

URL: http://www.suse.com/oracle/
Contact: oracle@suse.com
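Final check: once the database is created and the listeners and instances
are up on all nodes, a quick way to verify the whole cluster database is to
query gv$instance. This assumes the per-node ORACLE_SID set up earlier in
/etc/profile.d/oracle.sh is in effect in your shell:
    # run as user oracle on any node
    lsnrctl status     # the listener should list the local instance
    echo 'select instance_name, host_name, status from gv$instance;' | \
        sqlplus "/ as sysdba"
    # every instance of the cluster database should show up with
    # status OPEN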