
====== Bacula backup server on Debian Lenny, with remote SQL server ======

This node is a REALLY REALLY incomplete scratch-space for my Bacula-related notes...

===== What is Bacula? =====

First of all, if you are reading this, I hope you have at least a minimal knowledge of what Bacula is. As in, at least you know that it is a system for backup, recovery and verification of computer data. Hopefully, you also know that it is a scalable, enterprise-ready solution, and you are prepared for that.

As with everything else that gets labeled 'enterprise', and even 'scalable', Bacula is a system that is split into several parts, and is highly configurable. This gives great flexibility, at the cost of being rather complex to set up compared to smaller, simpler systems.

If you are looking to back up your workstation, and only that, Bacula is probably not for you. The same is probably true if you are looking at doing backups for a small set of computers; say two to four. On the other hand, if you are planning on doing backups for a greater number of systems, across operating systems, and/or require dependable backup volume control, Bacula is probably very well suited.

If you are coming from a commercial Enterprise backup solution, you may be surprised (hopefully pleasantly) to see that setup of Schedules, Clients, Jobs and the like is done in text-based configuration files, rather than a point-and-click GUI (or cryptic command line console).

===== Bacula components =====

As mentioned, Bacula is split into several parts. The following figure tries to show the central components, with the arrows describing the direction in which command and data flows are initiated.

{{:guides:bacup:bacula-components.png?450|}}

==== Director ====

Central to a Bacula installation is the Director. Simply put, the Director is the Bacula server itself, the central component that implements scheduled tasks, controls the running of backups and restores, and handles messages and reporting.
==== Console ====

Console applications exist in a variety of flavours for Bacula. Common to all of them is that they allow administrators to communicate with the Director (and with other components via the Director) to show status, list information, manipulate storage pools, run jobs et cetera.

Unlike many other Enterprise solutions, the management console is only used for management, reporting and maintenance, and not for configuration. Configuration of Bacula components is done in configuration files, while the Console allows the administrator to operate and manage the dynamic environment that results from the 'static' base configuration.

So, as an example, the definition of a Client and a Job for that Client is done in configuration files, and the data stored when the Job is run is managed using the Console. Another example may be that a Storage and a Pool get defined in configuration files, whereas the Volumes assigned to the Pool and Storage are managed through the Console.

==== File Server ====

The Bacula File Server is also known as the File Daemon and the Client program. This is the software installed on the machine to be backed up. The File Server is responsible for receiving commands from the Director, and for sending data for backup to a Storage, or receiving data from a restore. The File Server handles reading and writing data from/to the file systems on the machine, along with the file and security attributes associated with the data.

The File Server is not responsible for defining what data is to be backed up. This is part of the configuration done at the Director.

==== Storage ====

As the name should imply, a Storage Server (or Daemon) handles storage of backup data on storage systems. A single Storage Server may be used by multiple Directors, and a single Director may use multiple Storage Servers, allowing for a very flexible and scalable solution.
Storage Servers use actual storage directly, either as file-based storage in the filesystems available on the host running it, or as storage devices attached to that host. File, tape, WORM, FIFOs and CD/DVD-ROM are supported as storage types, and a large variety of autochanger systems are supported, especially for tape-based storage. A single Storage Server can serve multiple Storage devices/definitions to the Directors and File Servers communicating with it.

Since Bacula communication with the Storage Server is done using TCP/IP, this component can exist on any host accessible to the Director and File Servers, including the same host.

==== Catalog ====

The Catalog component is not a separate program in and of itself, but it is a central concept. As with all large-scale backup solutions, a large amount of meta- and index-data gets generated by Bacula, and this dynamic data needs to be stored somewhere. Examples are the index of files for a backup that has been run, and the state of the media pools. Bacula saves this information in the Catalog.

The Catalog is stored in a relational (SQL) database. Three different storage backends are available: SQLite, MySQL and PostgreSQL. The database may be stored locally, on the same host as the Director, or it may be stored on a remote database server (MySQL/PostgreSQL). In this guide, a remote MySQL server will be used.

===== About this guide =====

The goal of this document is to end up with a system consisting of:

  * One Director server
  * One Catalog for the Director stored in a remote MySQL database
  * One Storage Server running on the same host as the Director
  * Two storage Devices:
    * A File-based storage
    * An autochanger, using [[mhvtl]]
  * Three Clients
    * The localhost / director server
    * A remote Debian system
    * A remote Windows system
  * A separation of (at least) Client-related configuration into smaller files.
And since I called this section "About this guide", this should be the location where I say that I take no responsibility whatsoever for any results you may experience if trying to implement a Bacula system when/after reading this.

===== Package installation on our Director server =====

The version of Bacula in the standard repositories for Debian Lenny is **really** old, 2.4.4-1, compared to the latest stable, 5.0.3. I will be using Lenny Backports to get a more recent version, 5.0.2-2...

I chose to go with MySQL in this setup. I would prefer going with PostgreSQL, but as I wanted to focus on Bacula, not database administration, I took the "easy way out", seeing that I am more experienced as a MySQL DBA...

First two basic tools:

<code>
apt-get -y install \
  libterm-readkey-perl \
  psmisc
</code>

Next, we'll add Lenny Backports to our APT sources, to be able to get a more recent version of Bacula.

<code>
echo -e "\n\ndeb http://backports.debian.org/debian-backports lenny-backports main" >> /etc/apt/sources.list
apt-get update
</code>

I also want future updates to be pulled from backports, to get security fixes and the like. So I added the following to /etc/apt/preferences:

<code>
Package: *
Pin: release a=lenny-backports
Pin-Priority: 200
</code>

Since I do not want the MySQL server to be installed on the server running the Bacula Director, and I want to avoid pulling in too many "unneeded" packages, the option "--no-install-recommends" is added to the following command. If you prefer to use aptitude, replace this option with "--without-recommends". Note that the lenny-backports version of the packages will be pulled in, thanks to the "-t lenny-backports" option. This option is identical for apt-get and aptitude.
<code>
apt-get \
  -t lenny-backports \
  --no-install-recommends \
  install \
  bacula-director-mysql \
  bacula-console \
  bacula-doc \
  bacula-fd \
  bacula-sd \
  bacula-sd-mysql
</code>

Since I already have a fairly well-performing and well-maintained database server, and I do not like the concept of "a new DB server for each app", this setup will use my existing database server.

Unfortunately, dbconfig-common does not support configuration of remote SQL servers for Bacula. So when debconf asks this:

<code>
Configure database for bacula-director-mysql with dbconfig-common?
</code>

... the answer is **No**.

This leads to a different problem: the installation "fails", because the post-install script for bacula-director-mysql fails. This should be solved by doing a bit of configuration, and then coming back later to fix it with "apt-get -f install"((I have been notified that the database setup even fails if you are installing mysql-server as a dependency for local database use... This is because the mysql-server is not yet running when debconf tries to use dbconfig-common...)).

==== Database seeding ====

Before we can start setting up Bacula, we need to set up our database. Because I chose to use a remote SQL server, and dbconfig-common is braindead and does not understand that concept, the database will have to be created and seeded with tables manually. This is strictly speaking a part of Configuration, but it is so important for the elementary setup that I will call it a part of installation.

Unfortunately, the package ships only with a shell script to seed the tables, assuming that the database will be installed locally. Fortunately, the script is basically an SQL script wrapped in shell commands. So, let's use that!
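The rest of this section does the trimming by hand in an editor. If you would rather script it, here is a minimal Python sketch of the same cuts: keep everything from the ''USE'' line up to (but not including) the ''END-OF-DATA'' line, fill in the database name, and optionally prepend the InnoDB line. The function name and the ''SAMPLE'' text are made up for illustration; ''SAMPLE'' is only a stand-in shaped like the description in this guide, not the real contents of ''make_mysql_tables''.

<code python>
def shell_wrapper_to_sql(script_text, db_name, use_innodb=False):
    """Cut the SQL part out of a make_mysql_tables-style shell script."""
    lines = script_text.splitlines()

    # First line of the SQL part: the "USE ${db_name};" statement.
    start = None
    for i, line in enumerate(lines):
        if line.strip().startswith("USE "):
            start = i
            break
    if start is None:
        raise ValueError("no USE line found")

    # End of the SQL part: the line holding only the here-document marker.
    end = None
    for i in range(start + 1, len(lines)):
        if lines[i].strip() == "END-OF-DATA":
            end = i
            break
    if end is None:
        raise ValueError("no END-OF-DATA line found")

    sql = lines[start:end]
    sql[0] = "USE {};".format(db_name)             # replace ${db_name}
    if use_innodb:
        sql.insert(0, "SET storage_engine=INNODB;")  # optional first line
    return "\n".join(sql) + "\n"


# Hypothetical stand-in for the packaged script (NOT its real contents).
SAMPLE = """#!/bin/sh
bindir=/usr/bin
db_name=bacula
if $bindir/mysql $* -f <<END-OF-DATA
USE ${db_name};
CREATE TABLE Example (Id INTEGER);
END-OF-DATA
then
   echo "ok"
else
   echo "failed"
fi
exit 0
"""

print(shell_wrapper_to_sql(SAMPLE, "bacula_db", use_innodb=True))
</code>

To use it for real, you would read ''/usr/share/bacula-director/make_mysql_tables'' into a string, pass it through the function, and write the result to ''~/bacula_init.sql''. Do look at the result before loading it; the sketch assumes the script has exactly one ''USE'' line and one ''END-OF-DATA'' line.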
<code>
cp /usr/share/bacula-director/make_mysql_tables ~/bacula_init.sql
vim ~/bacula_init.sql
</code>

Remove the top lines leading up to, but not including, the line:

<code>
USE ${db_name};
</code>

On that line, replace '${db_name}' with the actual database name that you'll be using. If you, like me, prefer to use InnoDB rather than MyISAM, add the following as the very first line of the file:

<code>
SET storage_engine=INNODB;
</code>

Next, go to the bottom of the file, and remove everything from (and including) the line:

<code>
END-OF-DATA
</code>

The lines you remove should be the mentioned one, plus some "then - echo - else - echo - fi - exit" ...

So, now that we have an SQL script to use, create the actual database on the database server, and grant fairly open permissions on the database to a user created for Bacula. The following is not good practice, but it will get the job done. If you want more precise control, apply it when adding the grant, but also remember that you can easily modify the grant later on....

<code>
ssh databaseserver
mysql -u root -p mysql
</code>

<code>
create database bacula_db;
grant all privileges on bacula_db.* to 'bacula'@'backupserver' identified by 'password';
</code>

Now that the database is created, and a user for Bacula is created and granted permissions to use the database, it is time to fill the database. Get back to our Bacula server, and load the SQL script.

<code>
mysql -h databaseserver -u bacula -p < ~/bacula_init.sql
</code>

===== Configuration =====

Since Bacula is separated into different components that can live completely separately, the configuration of these components is split into respective configuration files. Needless to say, these configuration files relate to each other, enabling communication between the components.
Here is an attempt at visualizing the relations:

{{:guides:bacup:bacula-config-relations.png?500|}}

The two most central configuration files, in my "backupserver"-oriented view, are the Storage Daemon config, bacula-sd.conf, and the Director config, bacula-dir.conf. I started by getting to know the bacula-dir.conf file, and then started working with the configuration by setting up my Storage Daemon, so that I had my storage devices available.

Before we dive into the configuration of Bacula, we should get an overview of what the Director configuration file contains, how it is sectioned, and how the sections relate to each other (and the surrounding world).

{{:guides:bacup:bacula-dir-sections.png?500|}}

I won't describe the sections in text here, so take some time examining the above figures until you feel you have a grasp of how the files and sections are related.

:!: **Note** In the following configuration, note the following:

  * The hostname of the server hosting my Director and Storage Daemon is __bactank__
  * The hostname of my database server is simply __database__
  * The hostname of my Linux client is __linuxclient__
  * The hostname of my Windows client is __windows__
  * I will be using one FileSet for each client
  * On the host __bactank__ I will be excluding /opt completely, and store VTL files and File-based "volumes" under that directory.

==== Storage Daemon ====

My goals for the Storage Daemon are, as stated, to run it on the same host as the Director, and to provide two types of Storage through it: a File-based storage, and an mhvtl autochanger/virtual tape library. Before progressing, you may want to take a quick look at my [[mhvtl]] description/guide, to get familiar with how the virtual library is represented as SCSI devices.

A bit of work has been done for us by the Debian packages, so the configuration file for the SD is already prepared for communication from the Director. Most importantly, this means that proper password relationships have been set up.
But, in my opinion, a lot of unneeded stuff is in there as well. I started with the file from the Debian package, stripped away all that I did not want, and added what I needed.

The configuration file of the Storage server is ''/etc/bacula/bacula-sd.conf''. It should start with defining the properties of the Storage Daemon itself:

<code>
Storage {
  Name = bactank-sd;
  SDPort = 9103;
  WorkingDirectory = "/var/lib/bacula";
  Pid Directory = "/var/run/bacula";
  Maximum Concurrent Jobs = 20;
}
</code>

Here we assign the Storage Daemon a name, and tell it to listen on any interface/address, port 9103. We tell it to use /var/lib/bacula as a scratch/workspace, and finally that we do not want more than 20 concurrent jobs using this Daemon.

Next, we need to set up a definition that allows the Director to control the Storage:

<code>
Director {
  Name = bactank-dir;
  Password = "random-generated-password-identical-to-director-conf";
}
</code>

The Name needs to be identical to the Name that we will assign to our Director instance, and it gets auto-filled by the Debian packages as ''hostname-dir''. The Password will be auto-generated by the Debian packages, and needs to be identical to the ''Password ='' statement in the Storage section of the Director config.

The Debian configuration will also include a Director section for monitoring. Leave this in. I will not comment on that part further, other than to say that more than one system may control a Storage Daemon, through configured Director sections.

I wanted a File-based backup resource. I will not really use this anywhere, but I am including it to show how to set one up.

<code>
Device {
  Name = FileStorage;
  Media Type = File;
  Archive Device = /opt/bacula-filestore;
  LabelMedia = yes;
  Random Access = Yes;
  AutomaticMount = yes;
  RemovableMedia = no;
  AlwaysOpen = no;
}
</code>

No files will be created at this location before Bacula actually uses this resource to create a volume and store data to it.
Also, according to my understanding, a new file (volume) will be created for each Job((I have not yet tested File-based storage, so I will probably come back and update this.)). I do not specify any sizes, letting auto-labelling and Volume Management create and close the files as they see fit.

Next up, I add the four tape drives presented by [[mhvtl]]. I will simply list one of them, as the rest are identical with the exception of the Name, Drive Index and Archive Device:

<code>
Device {
  Name = Drive-1;                 # Will be referenced as device name by Autochanger later.
  Drive Index = 0                 # Index as reported by the changer, and as used by bacula
  Media Type = LTO-4;             # Description of type of media.
  Archive Device = /dev/nst0;     # Non-rewinding SCSI device
  AutomaticMount = yes;           # when device opened, read it
  AlwaysOpen = yes;               # Keep the device open until explicitly unmounted/released
  RemovableMedia = yes;           # Well, duh ;)
  RandomAccess = no;              # Tapes are by nature sequential
  AutoChanger = yes;              # This device is part of an autochanger
  Hardware End of Medium = No;    # Needed for proper operation on mhvtl
  Fast Forward Space File = No;   # Needed for proper operation on mhvtl
  # Heed the warnings in the distribution file about tapeinfo and smartctl.
  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}
</code>

:!: Note that the Media Type is descriptive, not technology based. Also note that if you have multiple changers with the same media, that do not share the media, you will need to make this different between the changers, or else Bacula may try to load a tape that belongs to one changer into the other... Adding a prefix or suffix to the Media Type will make them "different" in the Media index, making sure this does not happen.

Since my "autochanger" has four drives, this needs to be repeated for all four of them.
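Rather than copy-pasting the block four times, the stanzas can be generated. The following is a small Python sketch under one loud assumption: that drive N of the virtual library shows up as ''/dev/nst(N-1)'' and has Drive Index N-1. That matches Drive-1 above, but check your own device mapping before trusting it. The template simply repeats the directives from the Drive-1 block, minus the comments.

<code python>
# Template based on the Drive-1 block above. Doubled braces are literal
# braces for str.format(); only {number} and {index} get substituted.
DEVICE_TEMPLATE = """Device {{
  Name = Drive-{number};
  Drive Index = {index};
  Media Type = LTO-4;
  Archive Device = /dev/nst{index};
  AutomaticMount = yes;
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes;
  Hardware End of Medium = No;
  Fast Forward Space File = No;
  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}}
"""


def render_drives(count):
    """Return `count` Device stanzas, named Drive-1 .. Drive-<count>.

    ASSUMPTION: Drive-N uses Drive Index N-1 and /dev/nst(N-1).
    """
    return "\n".join(
        DEVICE_TEMPLATE.format(number=i + 1, index=i) for i in range(count)
    )


print(render_drives(4))
</code>

The output can be pasted into ''bacula-sd.conf'' in place of the hand-written Device blocks for the tape drives.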
When that is done, we get to the Autochanger itself:

<code>
Autochanger {
  Name = MHVTL;
  Changer Device = /dev/sg4;
  Device = Drive-1;
  Device = Drive-2;
  Device = Drive-3;
  Device = Drive-4;
  Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d";
}
</code>

Simple enough: this gives the autochanger a name that can be referenced by the Director, says what "physical" device it is, lists the Device definitions that make up the attached drives of the changer, and finally gives the command to run for controlling it.

Finally, we define that all messages generated by the Storage Daemon should be sent to the Director for processing/filtering/delivery:

<code>
Messages {
  Name = Standard;
  director = bactank-dir = all;
}
</code>

==== File Daemon for localhost ====

To be able to flesh out as much as possible of the Director config, I first want to have the File Server/Daemon used to make backups of the backup server itself defined before starting the Director. So, let's attack ''/etc/bacula/bacula-fd.conf'' on ''bactank''.

Here I start out with the definition of the Directors that are permitted to control the client. Just like in the ''bacula-sd.conf'' file, there is a Monitor definition you can leave in, and focus on the actual Director instead:

<code>
Director {
  Name = bactank-dir
  Password = "random-generated-password-identical-to-director-conf"
}
</code>

Again, the Name statement needs to match the Name given to the Director, and it still defaults to ''hostname-dir''. The password will be auto-generated by the Debian packages, and needs to match the Password definition in the relevant Client section of the Director configuration.

Next is the definition of the FileDaemon itself:

<code>
FileDaemon {
  Name = bactank.example.com;           # Name used in Director client config
  FDport = 9102;                        # where we listen for the director
  #FDAddress = 127.0.0.1;               # If you want to close down to a single address.
  WorkingDirectory = /var/lib/bacula;   # Where to "scratch and temp"
  Pid Directory = /var/run/bacula;
  Maximum Concurrent Jobs = 20;
}
</code>

The Name can be anything you want, but must be identical in the Director's Client definition, and it will be used for associating data generated from this File Server with metadata in the Catalog. The default will be ''hostname-fd'', but I prefer more verbose naming. Note that the name must differ from the Name statement of your Storage Daemon and your Director, when these run on the same host.

I have commented out the FDAddress statement, telling the FD to listen on any interface. This may be against your security policy, so feel free to lock it down to either the loopback address or a specific IP on the host.

As with the Storage Daemon, we close off ''bacula-fd.conf'' with a definition of where to send Messages:

<code>
Messages {
  Name = Standard
  director = bactank-dir = all, !skipped, !restored
}
</code>

This is a little more precise than that of the Storage. Here we say that all messages, except those related to skipped files and restored files, should be sent to the director ''bactank-dir''. In this context, //skipped// means files not included in the backup because they were configured to be excluded, or skipped because they were not changed when doing an incremental or differential backup.

==== Director ====

Now we come to the longest configuration file yet, the Director configuration, ''/etc/bacula/bacula-dir.conf''. I could not understand the organization of the Debian-packaged version, so I have re-organized the file to better reflect the relations of the sections to each other.

=== Director itself ===

We start off with the natural top-most section, the definition of the Director itself. As you should have noticed in the above sections, a consistent Name for the Director is important: it is used not only to identify the Director, but also as a part of the authentication and authorization in communication between components.
<code>
Director {
  Name = bactank-dir;
  DIRport = 9101;                  # where we listen for UA connections
  # DirAddress = xxx.yyy.zzzz.www; # IP address to listen on, if needed
  QueryFile = "/etc/bacula/scripts/query.sql";
  WorkingDirectory = "/var/lib/bacula";
  PidDirectory = "/var/run/bacula";
  Maximum Concurrent Jobs = 1;
  # Console password
  Password = "random-generated-password-used-by-console-connections";
  Messages = Daemon;
}
</code>

If you want the Director to only be available for Console applications on a given IP address, or even only from ''localhost'', use DirAddress to lock this down.

The Messages directive references the Name given to a Messages section in the same file. We'll get back to this one, but note that this differs from how we wrote Messages sections in the other files. In the other files, the Messages section typically describes which Director should receive which messages. On the Director, we'll be using Messages sections to actually do something with those messages.

=== A read-only console for monitoring ===

<code>
Console {
  Name = bactank-mon
  Password = "shared-secret-password-used-in-console-config"
  CommandACL = status, .status;   # Allow only status-reading.
}
</code>

=== Catalog ===

The Catalog is so central to the Director that I put this section next:

<code>
Catalog {
  Name = StandardCatalog;              # One director may have multiple Catalogs.
  DB Address = database.example.com;   # What server to use, and
  DB Port = 3306;                      # What port to connect to.
  dbname = bacula_db;                  # The name of the SQL database (created during seeding)
  user = bacula;                       # The username used when connecting
  password = "db_password";            # The password of the database user
}
</code>

Hopefully that was relatively self-explanatory. The most common setup is to use a single Catalog with a single Director. If your setup is LARGE, or you think you need separate Catalog instances for some other reason, please refer to the official documentation. But before you go: it really is as simple as defining more blocks like the one above.
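Since the Catalog lives on a remote server, it can save some head-scratching to check that the Director host can reach the database port at all before starting the Director. The following is a minimal Python sketch using only the standard ''socket'' module. The function name is my own invention, and it only tests that a TCP connection can be opened; it says nothing about whether the MySQL grant or the password is right.

<code python>
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and connects; the "with"
        # block closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False
</code>

With the values from the Catalog block above, the call would be ''port_is_reachable("database.example.com", 3306)'', run on the Director host. If it returns False, look at name resolution, firewalls and the MySQL bind address before blaming Bacula.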
=== Messages ===

Messages are a fairly "used-by-all" element, so I put two sections defining two different behaviours next. First, the message delivery for the Daemon/Director, then the Standard resource that will be used for all other Messages:

<code>
Messages {
  Name = Daemon;
  mailcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<bacula@bactank.example.com\>\" -s \"Bacula daemon message\" %r"
  mail = operator@example.com = all, !skipped
  console = all, !skipped, !saved
  append = "/var/lib/bacula/log" = all, !skipped
}

Messages {
  Name = Standard
  mailcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<bacula@bactank.example.com\>\" -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<bacula@bactank.example.com\>\" -s \"Bacula: Intervention needed for %j\" %r"
  mail = operator@example.com = all, !skipped
  operator = operator@example.com = mount
  console = all, !skipped, !saved
  append = "/var/lib/bacula/log" = all, !skipped
}
</code>

Things to note in the above, when compared to the default configuration:

  * I specify a different __from-address__ than the recipient
  * I have changed the recipient to a more sane value
  * I prefer to have the bacula Daemon and Standard log in separate files.

Other than that, the above is fairly stock. Make sure you read the rationale for replacing the from-address in the "NOTE!" block of the default configuration.

=== Storage ===

Our Storage servers and devices should follow next.

<code>
# Definition of file storage device
Storage {
  Name = File;           # Name to use when referencing this storage in -dir.conf
  Address = bactank;     # N.B. Use a fully qualified name here
  SDPort = 9103;         # Port the Storage Daemon listens on (SDPort in bacula-sd.conf)
  Password = "random-generated-password-identical-to-sd-conf";
  Device = FileStorage;  # must be same as Device in Storage daemon
  Media Type = File;
}

# Definition of mhvtl autochanger
Storage {
  Name = MHVTL;          # Name to use when referencing this storage in -dir.conf
  Address = bactank;     # N.B. Use a fully qualified name here
  SDPort = 9103;         # Port the Storage Daemon listens on (SDPort in bacula-sd.conf)
  Password = "random-generated-password-identical-to-sd-conf";
  Device = MHVTL;        # must be same as the Autochanger Name in Storage daemon
  Media Type = LTO-4;    # must be same as MediaType in Storage daemon
  Autochanger = yes;     # enable for autochanger device
}
</code>

The comments say "N.B. Use a fully qualified name here", but in reality, anything that resolves to the IP that the Storage Daemon listens on can be used. Observe that the name used here will be reported to the File Servers, and they will use it to communicate with the Storage Daemon. So all Clients will also need to be able to resolve what you enter here to an IP.

:!: Note that the Storage definition for the autochanger references the Autochanger as Device, not the individual tape drives. It is the responsibility of the Storage Daemon to represent the changer correctly. Also, remember the notes about Media Type when we configured the Storage Daemon.

=== Pools ===

A Pool is a collection of media/volumes, and it is natural to define these once we have the Storages defined.

Bacula supports a quite "magical" pool: if a pool exists with the name Scratch, Empty and Recyclable volumes of the correct Media Type for a given Job present in this pool will be automatically moved to a Pool that needs additional tapes when Jobs are run. This means we can start by adding all our volumes to the Scratch Pool, and these tapes will be allocated as needed. By also adding the directive "RecyclePool = Scratch", volumes will be returned to this pool as soon as they get marked as Recyclable and subsequently Purged.
<code>
# Default pool definition
Pool {
  Name = Default;                 # The name to reference this Pool
  Storage = MHVTL;                # A pool uses a single Storage.
  Pool Type = Backup;             # Currently supported: Backup..
  Recycle = yes;                  # Bacula can automatically recycle Volumes
  AutoPrune = yes                 # Prune expired volumes
  Volume Retention = 4 months     # 1/3 year
  File Retention = 1 months
  Job Retention = 2 months
  RecyclePool = Scratch           # Move to this pool when marked Recyclable
  Cleaning Prefix = CLN           # If cleaning tapes are available, they have this prefix.
}

Pool {
  Name = Monthly;                 # The name to reference this Pool
  Storage = MHVTL;                # A pool uses a single Storage.
  Pool Type = Backup;             # Currently supported: Backup..
  Recycle = yes;                  # Bacula can automatically recycle Volumes
  AutoPrune = yes                 # Prune expired volumes
  Volume Retention = 24 months    # 2 years
  File Retention = 9 months
  Job Retention = 12 months
  RecyclePool = Scratch           # Move to this pool when marked Recyclable
  Cleaning Prefix = CLN           # If cleaning tapes are available, they have this prefix.
}

# Scratch pool definition
Pool {
  Name = Scratch
  Storage = MHVTL
  RecyclePool = Scratch
  Pool Type = Backup
}
</code>

Volume Retention needs to be set fairly high, at least higher than any File or Job retention. The retention periods define how long data about a given Volume/Job/File is to be kept in the Catalog, and as such, how much time will pass before a Volume expires... Please look at the [[http://www.bacula.org/5.0.x-manuals/en/main/main/Catalog_Maintenance.html#SECTION004510000000000000000|Setting Retention Periods]] section of the Bacula manual for an explanation.

=== Schedules ===

I am a fan of using as few different Schedules in a backup solution as possible. Thus I define the absolutely needed Schedules as early on as possible as well.

<code>
# When to do the backups
# Do a full dump every Sunday
# Take a Differential (what changed since last Full) on Wednesdays
# Take increments (what changed since last backup) the rest of the week.
Schedule {
  Name = "WeeklyCycle"
  Run = Level=Full sun at 2:05
  Run = Level=Differential wed at 2:05
  Run = Level=Incremental mon-tue at 2:05
  Run = Level=Incremental thu-sat at 2:05
}

# In the Monthly Cycle the Pool gets overridden to use the Pool with a
# much longer Volume retention period.
# Every first Sunday of the month, make a full backup, then do
# Differential backup on Sunday the rest of the month.
Schedule {
  Name = "MonthlyCycle"
  Run = Level=Full Pool=Monthly 1st sun at 3:05
  Run = Level=Differential Pool=Monthly 2nd-5th sun at 3:05
}
</code>

=== Job Defaults ===

Creating at least one JobDefs section, providing generic defaults for Job definitions, makes writing later Jobs easier, as they then only need to contain the settings specific to that Job, and not repeat a whole lot of config over and over again. If any directive given in a JobDefs section is also given in the Job section, the definition in the Job section naturally replaces the default.

<code>
JobDefs {
  Name = "DefaultJob"
  Type = Backup
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Messages = Standard
  Pool = Default
  Priority = 10
  Rerun Failed Levels = yes
}
</code>

This sets up a "template", where any job using this definition:

  * Gets a job type of Backup (other types are: Restore, Verify, Admin)
  * Uses the "Full Set" FileSet definition (will be described later)
  * Runs according to the "WeeklyCycle" Schedule described above.
  * Uses the Standard message-handler.
  * Uses media from the pool named Default.
  * Has a Priority of 10 (higher values mean jobs will run later)
  * Will upgrade the next job to the Level of a job that previously failed.

=== FileSet ===

I add generic FileSets in the common configuration, before Clients and Jobs, simply because they are meant to be just that: generic. If I need to specify more precisely what to do with the files of a given Client or Job, I put that specific definition along with the Client and/or Job definition.
Here, I'll list my sort-of generic Unix-related FileSet that will try its best at doing a backup of a complete Unix filesystem, as long as the Client using this in a Job has all files in a single partition.

<code>
FileSet {
  Name = "Full Set"
  # Note: / backs up everything on the root partition.
  # if you have other partitions such as /usr or /home
  # you will probably want to add them too.
  Include {
    Options {
      signature = MD5
    }
    File = /
    File = /boot
  }
  Exclude {
    File = /proc
    File = /tmp
    File = /sys
    File = /dev
    File = /.journal
    File = /.fsck
    # I typically define /var/lib/bacula as the
    # WorkingDirectory for the File Daemon.
    File = /var/lib/bacula
    # Excluding files/directories that do not exist
    # has no effect other than making the FileSet generic..
    File = /opt/vtl
    File = /opt/bacula-filestore
  }
}
</code>

FileSets can be built VERY complex; this is an attempt at a fairly manageable base definition. For more details, and more complex examples, look at the relatively long sections in the [[http://www.bacula.org/5.0.x-manuals/en/main/main/Configuring_Director.html#SECTION001870000000000000000|Bacula Director manual]].

=== Client and Job in bacula-dir.conf ===

Now we actually have a very-close-to useful configuration. The only parts missing are Clients and Jobs. If you remember from far up in the document, one Job may only reference one Client, so even though there is a one-to-many relationship between Clients and Jobs (one Client may have multiple Jobs associated), each Job is directly tied to one, and only one, Client. This means that grouping Client definitions and Job definitions together is very natural. So natural in fact, that I'll use the file-inclusion features of Bacula configurations to create a separate configuration file for each client.

But, there is one Client, with its associated Jobs, that is natural to include in the bacula-dir.conf file, and that is the Client definition for the Bacula server, or localhost if you wish.
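As a side note on the per-client files mentioned above: for the remote clients, such a file could be generated rather than hand-written. The following Python sketch renders a Client and a Job stanza modelled on the ones used for the backup server in this guide, plus the ''@''-style include line that pulls the file into ''bacula-dir.conf''. The directory ''/etc/bacula/clients.d'', the function names and the example password are all assumptions of mine, not something the Debian packages set up. Also note that ''DefaultJob'' uses the Unix-oriented "Full Set" FileSet, so this only suits Unix clients as-is.

<code python>
# Doubled braces are literal braces for str.format().
CLIENT_TEMPLATE = """Client {{
  Name = {name};
  Address = {address};
  FDPort = 9102;
  Catalog = StandardCatalog;
  Password = "{password}";
  AutoPrune = yes;
}}

Job {{
  Name = "{name} Default"
  JobDefs = "DefaultJob";
  Client = {name};
  Write Bootstrap = "/var/lib/bacula/{address}.bsr";
}}
"""


def render_client(name, address, password):
    """Return the text of a per-client config file (Client + Job)."""
    return CLIENT_TEMPLATE.format(name=name, address=address, password=password)


def include_line(conf_dir, address):
    """Return the line to add to bacula-dir.conf to include the file."""
    return "@{}/{}.conf".format(conf_dir, address)


# Hypothetical values: the Linux client from this guide, a made-up password.
print(render_client("linuxclient.example.com", "linuxclient", "fd-password"))
print(include_line("/etc/bacula/clients.d", "linuxclient"))
</code>

The password must of course match the one in the Director section of ''bacula-fd.conf'' on the client itself, just like for the local File Daemon.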
Remember from the bacula-fd.conf that we set up: I used the Name ''bactank.example.com'', and a random-generated Password directive. Using those two strings:

<code>
# Client (File Services) to backup
Client {
  Name = bactank.example.com;
  Address = bactank;
  FDPort = 9102;
  Catalog = StandardCatalog;
  Password = "random-generated-password-identical-to-fd-conf";
  AutoPrune = yes;                    # Prune expired Jobs/Files
}
</code>

Now, we'll add the absolutely basic Job for this Client: a full backup.

<code>
Job {
  Name = "bactank.example.com Default"
  JobDefs = "DefaultJob";
  Client = bactank.example.com;
  Write Bootstrap = "/var/lib/bacula/bactank.bsr";
}
</code>

As this uses the DefaultJob JobDefs template, it will use the ''Full Set'' standard for Unixes as its FileSet, it will run using the ''WeeklyCycle'' Schedule, backing up data to the ''Default'' pool, and reporting messages to the ''Standard'' message facility.

FIXME Needs a FileSet and Job definition for dumping the Catalog, and backing that up.

=== One common Job ===

We'll add one Job that gets tied to the first Client, but will in actuality be modified in the Console application when it gets run. This Job definition is the ''RestoreFiles'' job. Because all Jobs must be defined before they can be used, and all Jobs must be tied to a Client, this gets put in the global config. But this Job will (should) never be run as-is. It exists as a template to be used when starting a restore job. In the console, you'll modify each and every setting in this Job to match the Restore that will be done.

<code>
Job {
  Name = "RestoreFiles"
  Pool = Default
  Type = Restore
  Client = bactank.example.com
  FileSet = "Full Set"
  Messages = Standard
  Where = /nonexistant/path/to/file/archive/dir/bacula-restores
}
</code>

==== Console ====

There is one more configuration file to take a look at before we are done with the basic configuration, and that is the configuration of the Bacula Console locally on the Director server.
This file is named ''/etc/bacula/bconsole.conf'' and is very simple:

<code>
Director {
  Name = localhost-dir;
  DIRport = 9101;
  address = localhost;
  Password = "random-generated-password-from-director-section-in-dir-conf";
}
</code>

===== Starting the components =====

The order you start the components in should be irrelevant, because the individual components will not try to communicate before they need to. For example, the Director will not contact the SD before a Storage operation needs to be done, or an FD before a Job needs to communicate with the Client. But to be on the safe side:

<code>
/etc/init.d/bacula-sd start
/etc/init.d/bacula-fd start
/etc/init.d/bacula-director start
</code>

===== Tapes/volumes in the Media database / pools from an autochanger =====

To get some Tape Volumes to work with, we start by doing a load-and-read query to the autochanger, to initialize and inventory the changer.

Start the Bacula console on the Director server host:

<code>
bconsole
</code>

Run the update/inventory:

<code>
update slots storage=MHVTL drive=0 scan
</code>

The ''update slots'' command will output a whole bunch of "Read error" error messages. This is normal: the VTL simulates an unlabeled/uninitialized tape, and that is what we want when running this command against a fresh VTL.

Now that the library is initialized and has a fresh inventory, the next step is to verify the result and note the slots that we want to add to our Media index:

<code>
status slots storage=MHVTL drive=0
</code>

As long as none of the Slots come up with an assigned Pool (or status/media for that matter), we can safely Label the Volumes. To automatically Label Volumes using their "barcode", use:

<code>
label storage=MHVTL drive=0 pool=Scratch slots=1-22 barcodes
</code>

Notice that this adds the Volumes to the Scratch pool, where they will reside until they are needed.
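The inventory commands above can also be fed to ''bconsole'' on standard input instead of being typed interactively. The following is a minimal sketch and not part of my actual setup: the function name ''slot_commands'' is made up for this example, and it only covers the two inventory commands, because ''label ... barcodes'' may ask for confirmation and is better run by hand.

```shell
#!/bin/sh
# slot_commands STORAGE DRIVE
# Hypothetical helper: prints the inventory commands from this section
# for the given Storage resource name and drive number.
slot_commands() {
    printf 'update slots storage=%s drive=%s scan\n' "$1" "$2"
    printf 'status slots storage=%s drive=%s\n' "$1" "$2"
}

# Show what would be sent to the console.
slot_commands MHVTL 0

# To actually run the commands, pipe them into bconsole on the Director host:
#   slot_commands MHVTL 0 | bconsole
```

Printing the commands first makes it easy to check them before they touch the changer.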
===== Run your first job =====

Start bconsole (if you exited it earlier):

<code>
bconsole
</code>

Do the absolutely simplest ''run'' command possible:

<code>
run
</code>

This will show you a list of available Job Resources to run. If you haven't added anything beyond my example here, you'll get something like:

<code>
A job name must be specified.
The defined Job resources are:
     1: bactank.example.com Default
     2: RestoreFiles
Select Job resource (1-2): 1
</code>

I selected the first resource, as I want to run the first backup. Doing a restore at this point makes no sense ;)

Now, this will list the settings for the job. I'll simply show how I modified the settings from an Incremental to a Full job:

<code>
Run Backup job
JobName:  bactank.example.com Default
Level:    Incremental
Client:   bactank.example.com
FileSet:  Full Set
Pool:     Default (From Job resource)
Storage:  MHVTL (From Pool resource)
When:     2010-10-31 23:53:48
Priority: 10
OK to run? (yes/mod/no): mod
Parameters to modify:
     1: Level
     2: Storage
     3: Job
     4: FileSet
     5: Client
     6: When
     7: Priority
     8: Pool
     9: Plugin Options
Select parameter to modify (1-9): 1
Levels:
     1: Full
     2: Incremental
     3: Differential
     4: Since
     5: VirtualFull
Select level (1-5): 1
Run Backup job
JobName:  bactank.example.com Default
Level:    Full
Client:   bactank.example.com
FileSet:  Full Set
Pool:     Default (From Job resource)
Storage:  MHVTL (From Pool resource)
When:     2010-10-31 23:53:48
Priority: 10
OK to run? (yes/mod/no): yes
Job queued.
JobId=1
</code>

We can now see that it is indeed running:

<code>
list jobs
</code>

<code>
+-------+--------------------------------+---------------------+------+-------+----------+----------+-----------+
| JobId | Name                           | StartTime           | Type | Level | JobFiles | JobBytes | JobStatus |
+-------+--------------------------------+---------------------+------+-------+----------+----------+-----------+
|     1 | bactank.example.com Default    | 2010-10-31 23:58:22 | B    | F     |        0 |        0 | R         |
+-------+--------------------------------+---------------------+------+-------+----------+----------+-----------+
</code>

In this first dump, I simply assumed that communication worked, and that I would have enough storage space for the backup. In a more proper scenario, you should first use the ''estimate'' command:

<code>
estimate job="bactank.example.com Default"
</code>

<code>
Using Catalog "StandardCatalog"
Connecting to Client bactank.example.com at bactank:9102
2000 OK estimate files=41,763 bytes=855,995,113
</code>

I knew that I had set MHVTL up to use tape-files of 15GB size, and that I had 390GB available on the LVM-volume where this gets stored, so handling ~850MB of data would be no problem. I also added Compression to the mhvtl setup, so the resulting storage use was:

<code>
bactank:~# ls -lh /opt/vtl/TAPE01L4
-rw-rw---- 1 vtl vtl 512M 2010-11-01 00:01 /opt/vtl/TAPE01L4
</code>

So, 850MB got compressed down to 512MB. I could push this further down, as I have only used compression level 1 of 9 in my MHVTL configuration.

===== Adding a remote linux client =====

Configuration and control of bacula clients / File Servers is done at the Director. But before we configure the Director, we'll start by installing the software.

==== On the client ====

What we need to do at the client is to install and configure the Bacula File Daemon. On a Debian system, we pull the package in using apt-get((Notice that I skipped setting up lenny-backports.
A higher-level director is compatible with an older file daemon)).

<code>
sudo apt-get -y install bacula-fd
</code>

Next, set up the ''/etc/bacula/bacula-fd.conf'' file on the Client host:

<code>
Director {
  Name = bactank-dir
  Password = "random-password-to-use-in-director-client-def"
}

# "Global" File daemon configuration specifications
FileDaemon {                          # this is me
  Name = linuxclient.example.com;
  FDport = 9102                       # where we listen for the director
  WorkingDirectory = /var/lib/bacula
  Pid Directory = /var/run/bacula
  Maximum Concurrent Jobs = 20
}

# Send all messages except skipped files back to Director
Messages {
  Name = Standard
  Director = bactank-dir = all, !skipped, !restored
}
</code>

Remember to update the ''Director ='' references to the name of your Director!

==== On the Director ====

Now, we will start using a wee bit of Bacula configuration file magic! We want to keep the main Director configuration as clean as possible. This may in part be achieved by splitting individual client definitions (Client, Job and possibly FileSet) out into separate per-client files.

Bacula configurations support inclusion of external files, and even inclusion of configuration generated by commands! Since Bacula 2.2.0 you can include the output of a command within a configuration file with the ''@|'' syntax. We use this to create a "dot-dee" directory for client configurations.

Added to the bottom of the ''/etc/bacula/bacula-dir.conf'' on the Director server:

<code>
# Include subfiles associated with configuration of clients.
# They define the bulk of the Clients, Jobs, and FileSets.
# Remember to "reload" the Director after adding a client file.
#
@|"sh -c 'for f in /etc/bacula/clients.d/*.conf ; do echo @${f} ; done'"
</code>

The reason we have not added this yet is that loading the configuration will fail if the directory is empty or non-existent. Before we reload the configuration, we want to add the client in.
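To see why an empty ''clients.d'' directory breaks the include, it helps to run the same loop by hand. This is a small sketch under assumptions: the function name ''emit_includes'' is made up for this example, and a temporary directory stands in for ''/etc/bacula/clients.d''. It relies only on standard ''sh'' behaviour, where a glob that matches nothing is left as a literal string.

```shell
#!/bin/sh
# emit_includes DIR
# Same loop as in the @| line above, wrapped in a function
# (the function name is hypothetical, for illustration only).
emit_includes() {
    for f in "$1"/*.conf ; do
        echo "@${f}"
    done
}

demo=$(mktemp -d)    # stand-in for /etc/bacula/clients.d

# Empty directory: the glob matches nothing and stays literal, so the
# Director would be told to include a file literally named "*.conf".
emit_includes "$demo"

# With one client file present, the loop emits a usable include line.
touch "$demo/linuxclient.example.com.conf"
emit_includes "$demo"

rm -rf "$demo"
```

A guard such as ''[ -e "$f" ] || continue'' inside the loop would suppress the literal pattern, but I have not tested how the Director handles a command that prints nothing, so creating the first client file before adding the include line remains the safe order.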
I prefer to use client-hostname based file-names, so I create the file \\ ''/etc/bacula/clients.d/linuxclient.example.com.conf'':

<code>
Client {
  Name = linuxclient.example.com;
  Address = linuxclient.example.com;
  FDPort = 9102;
  Catalog = StandardCatalog;
  Password = "random-password-to-use-in-director-client-def";
  AutoPrune = yes;                    # Prune expired Jobs/Files
}

# The full backup for this client.
Job {
  Name = "linuxclient.example.com Default"
  JobDefs = "DefaultJob";
  Client = linuxclient.example.com;
  # Per-client bootstrap file, so it does not overwrite bactank.bsr
  Write Bootstrap = "/var/lib/bacula/linuxclient.bsr";
}
</code>

This is more or less identical to the first Client, the "localhost" definition.

With that configuration bit in place, we are ready to load up the configuration and use it. On a bacula console controlling the director:

<code>
reload
list clients
</code>

<code>
Automatically selected Catalog: StandardCatalog
Using Catalog "StandardCatalog"
+----------+------------------------+---------------+--------------+
| ClientId | Name                   | FileRetention | JobRetention |
+----------+------------------------+---------------+--------------+
|        1 | bactank.example.com    |     5,184,000 |   15,552,000 |
|        2 | web.example.com        |             0 |            0 |
+----------+------------------------+---------------+--------------+
</code>

===== Adding a remote windows client =====

FIXME Missing.
===== Links, references, scratch =====

  * http://www.bacula.org/en/?page=downloads
  * http://bacula.org/5.0.x-manuals/en/main/main/Bacula_Main_Reference.html
  * http://www.backupcentral.com/phpBB2/two-way-mirrors-of-external-mailing-lists-3/bacula-25/bacula-5-0-3-backport-for-debian-lenny-107920/
  * http://www.crazysquirrel.com/computing/debian/backup/bacula-on-debian.jspx
  * http://panyasan.wordpress.com/2008/03/02/using-bacula-for-a-distributed-backup-system-debian-etch/
  * http://edin.no-ip.com/content/bacula-debian-sid-mini-howto
  * http://www.bacula.org/manuals/en/catalog/catalog/Installi_Configur_PostgreS.html
  * http://wiki.bacula.org/doku.php?id=sample_configs
  * http://www.bacula.org/5.0.x-manuals/en/main/main/Configuring_Director.html
  * http://lucasmanual.com/mywiki/Bacula
  * http://www.backupcentral.com/phpBB2/two-way-mirrors-of-external-mailing-lists-3/bacula-25/bacula-is-not-recycling-pruning-purging-automatically-95208/
  * http://www.bacula.org/3.0.x-manuals/en/console/console/Bacula_Console.html
  * http://sites.google.com/site/linuxvtl2/
  * http://backports.debian.org/Instructions/