Metadata-Version: 2.1
Name: bulk_restore_tool
Version: 1.0.1
Summary: Code42 Bulk Restore Tool
Project-URL: Documentation, 
Project-URL: Issues, 
Project-URL: Source, 
Author-email: Code42 Software <integrations@code42.com>
License-Expression: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.7
Requires-Dist: aiofiles
Requires-Dist: click
Requires-Dist: httpx
Requires-Dist: pydantic[dotenv]
Requires-Dist: rich
Requires-Dist: sqlite-utils
Description-Content-Type: text/markdown

# Bulk Restore Tool (BRT)


The Bulk Restore Tool is a Python library and command-line utility for efficiently restoring the contents of
a Code42 preservation archive without installing a Code42 agent on the target machine.

The tool is published to an AWS S3 bucket served by a CloudFront distribution at https://pypi.us.code42.com.

It can be installed by running:

```bash
python3 -m pip install --extra-index-url https://pypi.us.code42.com bulk-restore-tool
```

# Performing Bulk Restores

## Defining a job

The bulk restore tool creates and runs "restore jobs". A job is a collection of archives to restore from and the file
selection criteria applied to those archives.

Each job has a name. If the user does not provide one, the tool generates a new UUID to use as the name. The name
appears in the Code42 Audit Log entries for each archive restore session created.

At a bare minimum, a job requires only a list of archive IDs to restore; the default file selection then applies.

Minimal job as a .json file input:
```json
{
  "archives": [
    {"archiveId":  "1234"}
  ]
}
```

Minimal job created in Python:
```python
from bulk_restore_tool import JobDefinition

job = JobDefinition(archives=[{"archiveId":  "1234"}])
```

The optional parameters for a job are as follows:

- **jobId**: `str` Identifier for the job (shown in Audit Log entries). If not provided, a new UUID will be generated.
- **type**: `Optional[str]` Optional indicator of how the archives in the job were selected, i.e. by archive, device, username, or legal hold ID.
- **identifier**: `Optional[str]` Optional record of the identifiers of the given `type` that were provided.
- **selection**: `FileSelection` Selection criteria for which files to restore from the archive.
- **targetFolder**: `Path` Directory where the job metadata and restored files should be saved (defaults to the current working directory).
- **zipResults**: `bool` Indicates whether the restored files should be compressed into one zip archive per device in the job.
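
As a rough illustration of the parameters above, the sketch below assembles a fuller job definition as a plain dictionary and serializes it to JSON using only the standard library. The field names come from the list above and the `includeDeleted` selection key from the examples in this README; the helper name `build_job_dict`, the example values, and the assumption that the tool accepts exactly this JSON layout are illustrative and not confirmed here.

```python
import json
import uuid
from pathlib import Path


def build_job_dict(archive_ids, job_id=None, include_deleted=False,
                   target_folder=".", zip_results=False):
    """Assemble a job definition as a plain dict (hypothetical helper)."""
    return {
        # Mirror the documented default: fall back to a new UUID as the name.
        "jobId": job_id or str(uuid.uuid4()),
        "archives": [{"archiveId": str(a)} for a in archive_ids],
        "selection": {"includeDeleted": include_deleted},
        "targetFolder": str(Path(target_folder)),
        "zipResults": zip_results,
    }


job = build_job_dict(["1234", "5678"], job_id="my_job", include_deleted=True)

# The resulting text could be saved to a .json file and used as job input.
print(json.dumps(job, indent=2))
```

Building the dictionary in one place keeps the optional fields explicit, which makes it easier to compare against a job file generated by `brt create-job`.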

Using the `brt` command-line tool, you can also create jobs easily in the terminal. The `brt create-job` command accepts
the following types of identifiers to build the list of archives automatically for you: 

- archive
- device
- username
- legalHold

For example, to restore all the files for user `john@example.com`, who has 3 Code42 devices, each backing up to dual
destinations (so 2 archives per device), run the following:

```bash
brt create-job --type username john@example.com
```

A job definition .json file will be created, populated with the default file selection and all the archives
owned by `john@example.com`.

## Running a job

Once a job definition has been created, you can run the job either from the command line (which requires a .json file
of the job definition):

```bash
brt start-job restore_<job_name>.json
```

Or within a Python script with the `.start()` method of the `RestoreJob` class:

```python
from bulk_restore_tool import JobDefinition, RestoreJob

definition = JobDefinition(
    jobId="my_job",
    archives=[{"archiveId": "1234"}],
    selection={"includeDeleted": True},
)
job = RestoreJob(definition=definition)

# NOTE: if no credential args are passed to the `RestoreJob` constructor, the tool attempts to read credentials
# from the shell environment variables:
# - BRT_URL
# - BRT_USERNAME
# - BRT_PASSWORD
# and for API client auth:
# - BRT_API_CLIENT_ID
# - BRT_API_CLIENT_SECRET
#
# Otherwise construct the `RestoreJob` class with the appropriate credential args:
# job = RestoreJob(definition=definition, url="<url>", api_client_id="<api_client>", api_client_secret="<secret>")

job.start()
```
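
Before relying on environment variables, it can help to check which of them are actually set. The sketch below is a hypothetical pre-flight helper (`detect_brt_auth` is not part of the package) that reports which of the two credential sets named in the comment above is fully present. It assumes `BRT_URL` is needed for both modes and checks API client credentials first; neither detail is stated by the source.

```python
import os

# Variable names taken from the comment block in the example above.
USER_AUTH = ("BRT_USERNAME", "BRT_PASSWORD")
API_CLIENT_AUTH = ("BRT_API_CLIENT_ID", "BRT_API_CLIENT_SECRET")


def detect_brt_auth(env=None):
    """Return "api_client", "user", or None depending on which variables are set.

    Hypothetical helper; assumes BRT_URL is required for both auth modes.
    """
    env = os.environ if env is None else env
    if not env.get("BRT_URL"):
        return None
    # Checking API client credentials first is an arbitrary choice.
    if all(env.get(name) for name in API_CLIENT_AUTH):
        return "api_client"
    if all(env.get(name) for name in USER_AUTH):
        return "user"
    return None


mode = detect_brt_auth()
print(f"Detected credentials: {mode or 'none, pass credential args explicitly'}")
```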

## Job Metadata

Before a restore begins, the job must be prepared: the tool fetches the archive metadata (which it writes to .json files
for each archive in the target directory) and processes that metadata against the selection criteria.

For each device, a SQLite database is created, and all file records from any archives that the device owns are stored
in the database's `file` table. The bulk restore tool then applies the file selection filters, setting the `file.status`
column to either "SELECTED" or "EXCLUDED" (the default file selection is to include everything).
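
To make the selection step more concrete, the sketch below uses the standard-library `sqlite3` module to count records by status. It runs against an in-memory stand-in rather than a real job database: only the `file` table and its `status` column come from the description above, while the `path` column, the sample rows, and the helper name `count_by_status` are hypothetical.

```python
import sqlite3

# In-memory stand-in for a per-device database. Only the `file` table and its
# `status` column come from the description above; `path` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (path TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO file (path, status) VALUES (?, ?)",
    [
        ("C:/Users/john/report.docx", "SELECTED"),
        ("C:/Users/john/notes.txt", "SELECTED"),
        ("C:/Users/john/cache.tmp", "EXCLUDED"),
    ],
)


def count_by_status(connection):
    """Return a {status: count} mapping for the `file` table."""
    rows = connection.execute(
        "SELECT status, COUNT(*) FROM file GROUP BY status"
    )
    return dict(rows.fetchall())


counts = count_by_status(conn)
print(counts)
```

A query like this, pointed at a real per-device database, could serve as a quick check that the selection criteria matched the expected number of files before a job is started.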

Metadata is automatically prepared when running the `brt create-job` command, unless the `--no-calculate` option is
provided. When creating a job from a Python script, `.start()` will prepare the metadata automatically if metadata files
don't yet exist in the target directory. You can also fetch metadata without starting the job by calling the
`RestoreJob.prepare()` method.


## Restore Client

The `bulk_restore_tool` package also exposes a helper client for making some restore-related API calls directly.

```python
from bulk_restore_tool import Client, JobDefinition
from bulk_restore_tool.models import Archive

# NOTE: if no credential args are passed to the `Client` constructor, the tool attempts to read credentials
# from the shell environment variables
client = Client(url="<url>", api_client_id="<api_client>", api_client_secret="<secret>")

# get lists of archives (that can be used directly in a `JobDefinition`):
user_archives = client.get_archive_details(id_type="username", identifier="user@example.com")
device_archives = client.get_archive_details(id_type="device", identifier="<device_guid>")
legal_hold_archives = client.get_archive_details(id_type="legalHold", identifier="<legal_hold_Uid>")
all_archives = user_archives + device_archives + legal_hold_archives
definition = JobDefinition(archives=all_archives)


# get archive metadata for a single archive:
user_archive: Archive = user_archives[0]
response = client.get_archive_metadata(archive=user_archive, includeDeleted=True)


# restore a single fileId:
sessionId = client.create_bulk_restore_session(archive=user_archive, name="File Demo")
response = client.get_file(archive=user_archive, fileId="abcd1234", versionTimestamp=1677884124339)
with open("restored_file", "wb") as file:
    file.write(response.content)
```
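
The `versionTimestamp` value in the example above has 13 digits, which is consistent with a Unix epoch timestamp in milliseconds, although the source does not state the unit. Under that assumption, the sketch below converts between such values and timezone-aware `datetime` objects using exact integer arithmetic; the helper names are hypothetical and not part of the package.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(ms):
    """Convert integer epoch milliseconds to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def to_epoch_millis(dt):
    """Convert an aware datetime to integer epoch milliseconds."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


# Assumes versionTimestamp is epoch milliseconds (not confirmed by the source).
version = from_epoch_millis(1677884124339)
print(version.isoformat())
```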
