ArchiveItAPI Reference

The main client class for interacting with the Archive-it API.

First time using pyarchiveit?

See the Getting Started guide for installation and initialization instructions.

A client for interacting with the Archive-it API.

__init__(account_name, account_password, base_url='https://partner.archive-it.org/api/', default_timeout=None)

Initialize the ArchiveItAPI client with authentication credentials and a base URL.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| account_name | str | The account name for authentication. | required |
| account_password | str | The account password for authentication. | required |
| base_url | str | The base URL for the API endpoints. Defaults to the Archive-it API base URL. | 'https://partner.archive-it.org/api/' |
| default_timeout | float \| None | Default timeout in seconds for requests. None (the default) means no timeout. | None |
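
As a minimal sketch of initialization, the constructor arguments above could be assembled from environment variables so that credentials stay out of source code. The variable names `ARCHIVEIT_USER`, `ARCHIVEIT_PASSWORD`, and `ARCHIVEIT_TIMEOUT`, the helper names, and the import path `from pyarchiveit import ArchiveItAPI` are assumptions made for illustration, not part of the documented API.

```python
import os
from typing import Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for ArchiveItAPI from environment variables.

    The variable names used here are hypothetical; pick whatever fits
    your deployment.
    """
    env = os.environ if env is None else env
    timeout = env.get("ARCHIVEIT_TIMEOUT")
    return {
        "account_name": env["ARCHIVEIT_USER"],
        "account_password": env["ARCHIVEIT_PASSWORD"],
        # None means "no timeout", matching the documented default.
        "default_timeout": float(timeout) if timeout else None,
    }


def build_client():
    """Create a client; base_url is left at its documented default."""
    # Import path assumed; adjust if the package exposes the class elsewhere.
    from pyarchiveit import ArchiveItAPI

    return ArchiveItAPI(**client_kwargs_from_env())
```

Keeping the keyword-building step separate from the constructor call makes it possible to check the configuration without network access or real credentials.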

get_seed_list(collection_id, limit=-1, format='json', timeout=None)

Get seeds for a given collection ID or list of collection IDs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| collection_id | str \| int \| list[str \| int] | Collection ID or list of Collection IDs. | required |
| limit | int | Maximum number of seeds to retrieve per collection. Defaults to -1 (no limit). | -1 |
| format | str | The format of the response (json or xml). Defaults to "json". | 'json' |
| timeout | float \| None | Timeout in seconds for this request. Uses client default if not specified. | None |

Returns:

| Type | Description |
| --- | --- |
| list[dict] | List of seeds from all requested collections. |

Raises:

| Type | Description |
| --- | --- |
| HTTPStatusError | If the API request fails. |
| TimeoutException | If the request times out. |
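
Because get_seed_list returns a single flat list even when several collection IDs are passed, a caller may want to regroup the seeds afterwards. The sketch below shows one way to do that. The helper name `seeds_by_collection` is hypothetical, and the `"collection"` key read from each seed is an assumption about the response shape that should be checked against real output.

```python
from collections import defaultdict
from typing import Any, Optional


def seeds_by_collection(
    client: Any,
    collection_ids: list,
    limit: int = -1,
    timeout: Optional[float] = None,
) -> dict:
    """Fetch seeds for several collections in one call and group them.

    `client` is expected to be an ArchiveItAPI instance.
    """
    seeds = client.get_seed_list(collection_ids, limit=limit, timeout=timeout)
    grouped = defaultdict(list)
    for seed in seeds:
        # "collection" is an assumed key; seeds without it land under None.
        grouped[seed.get("collection")].append(seed)
    return dict(grouped)
```

Passing `timeout` here overrides the client-wide default for this one request, which can be useful for large collections.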

update_seed_metadata(seed_id, metadata)

Update metadata for a specific seed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| seed_id | str \| int | The ID of the seed to update. | required |
| metadata | dict | The metadata to update for the seed. | required |
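
A common use of update_seed_metadata is applying the same metadata to many seeds. The following sketch loops over seed IDs and records which calls succeeded. The helper name `update_many` is hypothetical, the broad `except` is a simplification, and the expected structure of `metadata` is not specified in this reference, so the caller is assumed to supply it in whatever form the API accepts.

```python
from typing import Any, Iterable


def update_many(client: Any, seed_ids: Iterable, metadata: dict) -> dict:
    """Apply the same metadata to several seeds, one call per seed.

    Returns {"updated": [seed ids], "failed": {seed id: error message}}.
    """
    report = {"updated": [], "failed": {}}
    for seed_id in seed_ids:
        try:
            client.update_seed_metadata(seed_id, metadata)
        except Exception as exc:  # broad on purpose: this is only a sketch
            report["failed"][seed_id] = str(exc)
        else:
            report["updated"].append(seed_id)
    return report
```

Collecting failures instead of stopping at the first error lets a long batch finish and leaves a list of seeds to retry.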

create_seed(url, collection_id, crawl_definition_id, other_params=None, metadata=None)

Create a new seed in the specified collection with the given crawl definition.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| url | str | The URL of the seed to create. | required |
| collection_id | str \| int | The ID of the collection to add the seed to. | required |
| crawl_definition_id | str \| int | The ID of the crawl definition to associate with the seed. | required |
| other_params | dict \| None | Additional parameters for the seed creation. | None |
| metadata | dict \| None | Metadata to set for the seed after creation. | None |

Returns:

| Type | Description |
| --- | --- |
| dict | The created seed data returned by the API. |
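
To illustrate how create_seed might be used in bulk, the sketch below adds a list of URLs to one collection, skips duplicates within the input, and collects the dicts returned for each created seed. The helper name `create_seeds` is hypothetical, and the duplicate check is a plain string comparison that does not consult the seeds already in the collection.

```python
from typing import Any, Iterable, Optional


def create_seeds(
    client: Any,
    urls: Iterable[str],
    collection_id,
    crawl_definition_id,
    metadata: Optional[dict] = None,
) -> list:
    """Create one seed per distinct URL and return the API responses."""
    created = []
    seen = set()
    for url in urls:
        if url in seen:
            # Skip exact repeats in the input list.
            continue
        seen.add(url)
        created.append(
            client.create_seed(
                url, collection_id, crawl_definition_id, metadata=metadata
            )
        )
    return created
```

The same `metadata` dict is passed to every call, so each new seed receives identical metadata after creation.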

delete_seed(seed_id)

Delete a seed by its ID.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| seed_id | str \| int | The ID of the seed to delete. | required |

Returns:

| Type | Description |
| --- | --- |
| dict | The seed data from the API after deletion. If successful, the 'deleted' flag should be True. |
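
Since the returned dict carries a 'deleted' flag, a caller can verify the outcome instead of assuming it. This is a minimal sketch with a hypothetical helper name, `delete_and_confirm`. It treats a missing flag as a failed deletion.

```python
from typing import Any


def delete_and_confirm(client: Any, seed_id) -> bool:
    """Delete a seed and report whether the API marked it as deleted."""
    result = client.delete_seed(seed_id)
    # Only an explicit True counts; a missing or falsy flag is a failure.
    return result.get("deleted") is True
```

A caller could log or retry when this returns False rather than silently continuing.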