ArchiveItAPI Reference

The main client class for interacting with the Archive-It API.

First time using pyarchiveit?

See the Getting Started guide for installation and initialization instructions.

`__init__(account_name, account_password, base_url='https://partner.archive-it.org/api/', default_timeout=None)`

Parameters:

Name	Type	Description	Default
`account_name`	`str`	The account name for authentication.	required
`account_password`	`str`	The account password for authentication.	required
`base_url`	`str`	The base URL for the API endpoints. Defaults to Archive-it API base URL.	`'https://partner.archive-it.org/api/'`
`default_timeout`	`float \| None`	Default timeout in seconds. Use None for no timeout.	`None`
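
As a rough sketch of how the constructor arguments might be assembled, the helper below reads credentials from environment variables instead of hard-coding them. The helper name `client_kwargs_from_env`, the variable names `ARCHIVEIT_USER` and `ARCHIVEIT_PASSWORD`, and the import path shown in the trailing comment are assumptions made for illustration; only the keyword names come from the signature above.

```python
import os


def client_kwargs_from_env(env=None, timeout=30.0):
    """Build keyword arguments for ArchiveItAPI from environment variables.

    ARCHIVEIT_USER and ARCHIVEIT_PASSWORD are hypothetical variable names.
    """
    env = os.environ if env is None else env
    required = ("ARCHIVEIT_USER", "ARCHIVEIT_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "account_name": env["ARCHIVEIT_USER"],
        "account_password": env["ARCHIVEIT_PASSWORD"],
        # base_url is left out so the documented default applies.
        "default_timeout": timeout,  # seconds; None means no timeout
    }


# Usage (import path assumed; check the Getting Started guide):
# from pyarchiveit import ArchiveItAPI
# client = ArchiveItAPI(**client_kwargs_from_env())
```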

`create_seed(url, collection_id, crawl_definition_id, other_params=None, metadata=None)`

Create a new seed in a specified collection with given crawl definition.

Parameters:

Name	Type	Description	Default
`url`	`str`	The URL of the seed to create.	required
`collection_id`	`str \| int`	The ID of the collection to add the seed to.	required
`crawl_definition_id`	`str \| int`	The ID of the crawl definition to associate with the seed.	required
`other_params`	`dict \| None`	Additional parameters for the seed creation.	`None`
`metadata`	`dict \| None`	Metadata to set for the seed after creation.	`None`

Returns:

Name	Type	Description
`dict`	`dict`	The validated created seed data returned by the API.

Raises:

Type	Description
`ValidationError`	If the input data or metadata structure is invalid.
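
The following is a minimal sketch of calling `create_seed` for several URLs in one collection. The helper `add_seeds` is hypothetical, and `client` is assumed to be an already initialized `ArchiveItAPI` instance. The expected structure of `metadata` is not described in this reference, so the sketch passes it through unchanged.

```python
def add_seeds(client, urls, collection_id, crawl_definition_id, metadata=None):
    """Create one seed per URL and return the seed data from each call."""
    created = []
    for url in urls:
        seed = client.create_seed(
            url=url,
            collection_id=collection_id,
            crawl_definition_id=crawl_definition_id,
            metadata=metadata,  # passed through as given; None is the documented default
        )
        created.append(seed)
    return created
```

A `ValidationError` raised by any call propagates to the caller, so seeds created before the failure remain in the collection.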

`delete_seed(seed_id)`

Delete a seed by its ID.

Parameters:

Name	Type	Description	Default
`seed_id`	`str \| int`	The ID of the seed to delete.	required

Returns:

Name	Type	Description
`dict`	`dict`	The validated seed data from the API after deletion. The 'deleted' flag should be True.

Raises:

Type	Description
`ValidationError`	If the API returns invalid seed data.
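
Because the returned seed data is expected to carry a `deleted` flag, a caller may want to check it explicitly. The sketch below shows one way to do that; `delete_and_confirm` is a hypothetical helper, and `client` is assumed to be an initialized `ArchiveItAPI` instance.

```python
def delete_and_confirm(client, seed_id):
    """Delete a seed and report whether the returned data marks it as deleted."""
    seed = client.delete_seed(seed_id)
    # The reference states that the 'deleted' flag should be True after deletion.
    return seed.get("deleted") is True
```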

`get_seed_by_id(seed_id)`

Get a seed by its ID.

Parameters:

Name	Type	Description	Default
`seed_id`	`str \| int`	The ID of the seed to retrieve.	required

Returns:

Name	Type	Description
`dict`	`dict`	The validated seed data returned by the API.

Raises:

Type	Description
`HTTPStatusError`	If the API request fails.
`TimeoutException`	If the request times out.
`ValidationError`	If the API returns invalid seed data.
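
Since a lookup can raise `TimeoutException`, a simple retry loop is one possible way to handle transient timeouts. This reference does not state which module defines that exception, so the sketch below takes the exception class (or a tuple of classes) as the `retry_on` argument. The helper `get_seed_with_retry` and its `attempts` and `delay` parameters are hypothetical.

```python
import time


def get_seed_with_retry(client, seed_id, retry_on, attempts=3, delay=1.0):
    """Call get_seed_by_id, retrying when one of the retry_on exceptions is raised."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return client.get_seed_by_id(seed_id)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            time.sleep(delay)  # fixed pause before the next attempt
```

Errors that are not listed in `retry_on`, such as `HTTPStatusError` or `ValidationError`, are not retried and propagate immediately.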

`get_seed_list(collection_id, limit=-1, sort=None, pluck=None, format='json', additional_query=None)`

Get seeds for a given collection ID or list of collection IDs.

Parameters:

Name	Type	Description	Default
`collection_id`	`str \| int \| list[str \| int]`	Collection ID or list of Collection IDs.	required
`limit`	`int`	Maximum number of seeds to retrieve per collection. Defaults to -1 (no limit).	`-1`
`sort`	`str \| None`	Field to sort the results by. A leading minus sign (-) on the field name indicates ascending order. Defaults to None. See the available fields in the API documentation (Data Models > Seed). Example values: "id", "-id", "last_updated_date", "-last_updated_date".	`None`
`pluck`	`str \| None`	Specific field to extract from each seed object (e.g. "url", "id"). Defaults to None (returns full seed objects).	`None`
`format`	`str`	The format of the response (json or xml). Defaults to "json".	`'json'`
`additional_query`	`dict \| None`	Additional query parameters to include in the request, given as a dict that maps each parameter name to a string or a list. A list matches any of its values (OR statement). Examples: {"last_updated_by": "PersonA"} or {"last_updated_by": ["PersonA", "PersonB"]}.	`None`

Returns:

Type	Description
`list[SeedKeys] \| list`	If `pluck` is None, a list of validated seed objects. If `pluck` is specified, a list of the plucked field values.

Raises:

Type	Description
`HTTPStatusError`	If the API request fails.
`TimeoutException`	If the request times out.
`ValidationError`	If the API returns invalid seed data.
`ValueError`	If the `sort` parameter is invalid.
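
To illustrate how `sort`, `pluck`, and `additional_query` combine, here is a small sketch that uses only the example values from the table above. The helper `seed_urls_by_editor` is hypothetical, and `client` is assumed to be an initialized `ArchiveItAPI` instance.

```python
def seed_urls_by_editor(client, collection_ids, editors, limit=50):
    """Return seed URLs from the given collections, filtered by last editor."""
    return client.get_seed_list(
        collection_ids,  # a single ID or a list of IDs
        limit=limit,  # applied per collection
        sort="-last_updated_date",  # one of the documented example values
        pluck="url",  # return only the "url" field of each seed
        additional_query={"last_updated_by": editors},  # a list matches any value
    )
```

For example, `seed_urls_by_editor(client, [123, 456], ["PersonA", "PersonB"])` would request up to 50 URLs from each of the two collections.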

`get_seed_with_metadata(metadata_field=None, metadata_value=None, limit=-1, pluck=None)`

Get seeds that match a specific metadata field and value.

Parameters:

Name	Type	Description	Default
`metadata_field`	`str \| None`	The metadata field to search (e.g., "Title", "Author").	`None`
`metadata_value`	`str \| None`	The value to search for within the specified metadata field.	`None`
`limit`	`int`	Maximum number of seeds to retrieve. Defaults to -1 (no limit).	`-1`
`pluck`	`str \| None`	Specific field to extract from each seed object (e.g. "collection"). Defaults to None (returns full seed objects).	`None`
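
As one possible use of `pluck`, the sketch below counts matching seeds per collection. It assumes that the method returns a list and that plucking "collection" yields one hashable collection ID per seed; neither is stated in this reference. The helper `count_matches_per_collection` is hypothetical.

```python
from collections import Counter


def count_matches_per_collection(client, field, value):
    """Count seeds whose metadata field matches value, grouped by collection."""
    collections = client.get_seed_with_metadata(
        metadata_field=field,
        metadata_value=value,
        pluck="collection",  # assumed to yield one hashable collection ID per seed
    )
    return Counter(collections)
```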

`search_seed_metadata(metadata_field=None, metadata_value=None, limit=-1, pluck=None)`

Search seeds by metadata field and value.

Note

You do not need to specify metadata_field to search for a value. To look up a value across all metadata fields, pass it as metadata_value and leave metadata_field as None.

Parameters:

Name	Type	Description	Default
`metadata_field`	`str \| list \| None`	The metadata field to search (e.g., "Title", "Author"). If a list is provided, searches within any of the fields.	`None`
`metadata_value`	`str \| list \| None`	The value to search for within the specified metadata field. If a list is provided, searches for any of the values.	`None`
`limit`	`int`	Maximum number of seeds to retrieve. Defaults to -1 (no limit).	`-1`
`pluck`	`str \| None`	Specific field to extract from each seed object (e.g. "seed", "name_control"). Defaults to None (returns full seed objects).	`None`

Returns:

Name	Type	Description
`list`	`list`	A list of seeds matching the search criteria.
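
The two search modes described above can be sketched with a thin wrapper. The helper `find_seeds` is hypothetical, and `client` is assumed to be an initialized `ArchiveItAPI` instance.

```python
def find_seeds(client, value, fields=None, limit=-1):
    """Search seed metadata for value, optionally restricted to some fields."""
    return client.search_seed_metadata(
        metadata_field=fields,  # None searches across all metadata fields
        metadata_value=value,  # a string, or a list to match any of several values
        limit=limit,
    )
```

Calling `find_seeds(client, "PersonA")` searches every metadata field, while `find_seeds(client, "PersonA", fields=["Title", "Author"])` restricts the search to those two fields.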

`update_seed_metadata(seed_id, metadata)`

Update metadata for a specific seed.

Parameters:

Name	Type	Description	Default
`seed_id`	`str \| int`	The ID of the seed to update.	required
`metadata`	`dict`	The metadata to update for the seed.	required

Raises:

Type	Description
`ValidationError`	If the metadata structure is invalid.
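
A common follow-up is applying the same metadata to several seeds. The sketch below does this with a plain loop; `apply_metadata` is a hypothetical helper, and the structure of the `metadata` dict is not specified in this reference, so it is passed through unchanged.

```python
def apply_metadata(client, seed_ids, metadata):
    """Apply the same metadata dict to each seed and return the IDs processed."""
    updated = []
    for seed_id in seed_ids:
        # A ValidationError from this call stops the loop and propagates.
        client.update_seed_metadata(seed_id, metadata)
        updated.append(seed_id)
    return updated
```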
