Use obspec as dev dependency #337

Merged
merged 2 commits into from
Mar 11, 2025
4 changes: 0 additions & 4 deletions docs/api/attributes.md

This file was deleted.

3 changes: 0 additions & 3 deletions docs/api/get.md
Expand Up @@ -6,9 +6,6 @@
::: obstore.get_range_async
::: obstore.get_ranges
::: obstore.get_ranges_async
::: obstore.GetOptions
::: obstore.GetResult
::: obstore.BytesStream
::: obstore.Bytes
::: obstore.OffsetRange
::: obstore.SuffixRange
1 change: 0 additions & 1 deletion docs/api/list.md
Expand Up @@ -3,7 +3,6 @@
::: obstore.list
::: obstore.list_with_delimiter
::: obstore.list_with_delimiter_async
::: obstore.ObjectMeta
::: obstore.ListResult
::: obstore.ListStream
::: obstore.ListChunkType
3 changes: 0 additions & 3 deletions docs/api/put.md
Expand Up @@ -2,6 +2,3 @@

::: obstore.put
::: obstore.put_async
::: obstore.PutResult
::: obstore.UpdateVersion
::: obstore.PutMode
2 changes: 1 addition & 1 deletion docs/blog/posts/obstore-0.4.md
Expand Up @@ -72,7 +72,7 @@ Obstore version 0.5 is expected to improve on extensible credentials by enabling

## Return Arrow data from `list_with_delimiter`

By default, the [`obstore.list`][] and [`obstore.list_with_delimiter`][] APIs [return standard Python `dict`s][obstore.ObjectMeta]. However, if you're listing a large bucket, the overhead of materializing all those Python objects can become significant.
By default, the [`obstore.list`][] and [`obstore.list_with_delimiter`][] APIs [return standard Python `dict`s][obspec.ObjectMeta]. However, if you're listing a large bucket, the overhead of materializing all those Python objects can become significant.

[`obstore.list`][] and [`obstore.list_with_delimiter`][] now both support a `return_arrow` keyword parameter. If set to `True`, an Arrow [`RecordBatch`][arro3.core.RecordBatch] or [`Table`][arro3.core.Table] will be returned, which is both faster and more memory efficient.

Expand Down
2 changes: 1 addition & 1 deletion mkdocs.yml
Expand Up @@ -62,7 +62,6 @@ nav:
- api/put.md
- api/rename.md
- api/sign.md
- api/attributes.md
- api/exceptions.md
- api/file.md
- obstore.fsspec: api/fsspec.md
Expand Down Expand Up @@ -157,6 +156,7 @@ plugins:
- https://arrow.apache.org/docs/objects.inv
- https://boto3.amazonaws.com/v1/documentation/api/latest/objects.inv
- https://botocore.amazonaws.com/v1/documentation/api/latest/objects.inv
- https://developmentseed.org/obspec/latest/objects.inv
- https://docs.aiohttp.org/en/stable/objects.inv
- https://docs.pola.rs/api/python/stable/objects.inv
- https://docs.python.org/3/objects.inv
Expand Down
47 changes: 0 additions & 47 deletions obstore/python/obstore/_attributes.pyi

This file was deleted.

4 changes: 3 additions & 1 deletion obstore/python/obstore/_buffered.pyi
Expand Up @@ -2,7 +2,9 @@ import sys
from contextlib import AbstractAsyncContextManager, AbstractContextManager
from typing import Self

from ._attributes import Attributes
# TODO: fix import
from obspec._attributes import Attributes

from ._bytes import Bytes
from ._list import ObjectMeta
from .store import ObjectStore
Expand Down
120 changes: 10 additions & 110 deletions obstore/python/obstore/_get.pyi
@@ -1,119 +1,13 @@
from collections.abc import Sequence
from datetime import datetime
from typing import TypedDict

from ._attributes import Attributes
# TODO: fix imports
from obspec._attributes import Attributes
from obspec._get import GetOptions

from ._bytes import Bytes
from ._list import ObjectMeta
from .store import ObjectStore

class OffsetRange(TypedDict):
"""Request all bytes starting from a given byte offset."""

offset: int
"""The byte offset for the offset range request."""

class SuffixRange(TypedDict):
"""Request up to the last `n` bytes."""

suffix: int
"""The number of bytes from the suffix to request."""

class GetOptions(TypedDict, total=False):
"""Options for a get request.

All options are optional.
"""

if_match: str | None
"""
Request will succeed if the `ObjectMeta::e_tag` matches
otherwise returning [`PreconditionError`][obstore.exceptions.PreconditionError].

See <https://datatracker.ietf.org/doc/html/rfc9110#name-if-match>

Examples:

```text
If-Match: "xyzzy"
If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-Match: *
```
"""

if_none_match: str | None
"""
Request will succeed if the `ObjectMeta::e_tag` does not match
otherwise returning [`NotModifiedError`][obstore.exceptions.NotModifiedError].

See <https://datatracker.ietf.org/doc/html/rfc9110#section-13.1.2>

Examples:

```text
If-None-Match: "xyzzy"
If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-None-Match: *
```
"""

if_modified_since: datetime | None
"""
Request will succeed if the object has been modified since

<https://datatracker.ietf.org/doc/html/rfc9110#section-13.1.3>
"""

if_unmodified_since: datetime | None
"""
Request will succeed if the object has not been modified since
otherwise returning [`PreconditionError`][obstore.exceptions.PreconditionError].

Some stores, such as S3, will only return `NotModified` for exact
timestamp matches, instead of for any timestamp greater than or equal.

<https://datatracker.ietf.org/doc/html/rfc9110#section-13.1.4>
"""

range: tuple[int, int] | list[int] | OffsetRange | SuffixRange
"""
Request transfer of only the specified range of bytes
otherwise returning [`NotModifiedError`][obstore.exceptions.NotModifiedError].

The semantics of this tuple are:

- `(int, int)`: Request a specific range of bytes `(start, end)`.

If the given range is zero-length or starts after the end of the object, an
error will be returned. Additionally, if the range ends after the end of the
object, the entire remainder of the object will be returned. Otherwise, the
exact requested range will be returned.

The `end` offset is _exclusive_.

- `{"offset": int}`: Request all bytes starting from a given byte offset.

This is equivalent to `bytes={int}-` as an HTTP header.

- `{"suffix": int}`: Request the last `int` bytes. Note that here, `int` is _the
size of the request_, not the byte offset. This is equivalent to `bytes=-{int}`
as an HTTP header.

<https://datatracker.ietf.org/doc/html/rfc9110#name-range>
"""

version: str | None
"""
Request a particular object version
"""

head: bool
"""
Request transfer of no content

<https://datatracker.ietf.org/doc/html/rfc9110#name-head>
"""
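
The `range` semantics documented in the removed `GetOptions` docstring can be sketched in plain Python. This is a minimal illustration, not obstore's or obspec's implementation: `resolve_range` is a hypothetical helper, and two behaviors are assumptions that the docstring does not state, namely that an `offset` at or past the end of the object is an error and that a `suffix` larger than the object returns the whole object.

```python
from typing import TypedDict, Union


class OffsetRange(TypedDict):
    """Request all bytes starting from a given byte offset."""

    offset: int


class SuffixRange(TypedDict):
    """Request up to the last `n` bytes."""

    suffix: int


RangeSpec = Union[tuple[int, int], list[int], OffsetRange, SuffixRange]


def resolve_range(spec: RangeSpec, size: int) -> tuple[int, int]:
    """Hypothetical helper: map a range spec to a half-open (start, end) pair."""
    if isinstance(spec, dict):
        if "offset" in spec:
            start = spec["offset"]
            # Assumption: an offset at or past the end is rejected.
            if start >= size:
                raise ValueError("offset starts after the end of the object")
            return start, size
        # Suffix: the value is the size of the request, not a byte offset.
        # Assumption: a suffix larger than the object is clamped to the start.
        return max(size - spec["suffix"], 0), size

    start, end = spec
    if end <= start:
        raise ValueError("range is zero-length")
    if start >= size:
        raise ValueError("range starts after the end of the object")
    # A range ending past the object returns the remainder; `end` is exclusive.
    return start, min(end, size)
```

For a 10-byte object, `(8, 20)` is clamped to `(8, 10)` and `{"suffix": 3}` resolves to `(7, 10)`.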

class GetResult:
"""Result for a get request.

Expand Down Expand Up @@ -142,6 +36,9 @@ class GetResult:

Note that after calling `bytes`, `bytes_async`, or `stream`, you will no longer be
able to call other methods on this object, such as the `meta` attribute.

This implements [`obspec.GetResult`][], but is redefined here to specialize the
exact instance of the `bytes` return type to be [`obstore.Bytes`][].
"""

@property
Expand Down Expand Up @@ -229,6 +126,9 @@ class BytesStream:

To fix this, set the `timeout` parameter in the
[`client_options`][obstore.store.ClientConfig] passed when creating the store.

This implements [`obspec.BufferStream`][], but is redefined here to specialize the
exact instance of the buffer return type to be [`obstore.Bytes`][].
"""

def __aiter__(self) -> BytesStream:
Expand Down
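
The new docstrings say that `GetResult` and `BytesStream` implement an obspec interface but are redefined to narrow the return type to `obstore.Bytes`. The pattern can be sketched with `typing.Protocol`. Every name below (`Buffer`, `GetResultLike`, `MyBytes`, `ConcreteGetResult`, `payload_size`) is a hypothetical stand-in chosen for illustration; none of them is the actual obspec or obstore definition.

```python
from typing import Protocol


class Buffer(Protocol):
    """Hypothetical stand-in for a generic bytes-like return type."""

    def __len__(self) -> int: ...


class GetResultLike(Protocol):
    """Hypothetical protocol: only promises that `bytes()` returns some Buffer."""

    def bytes(self) -> Buffer: ...


class MyBytes:
    """Hypothetical concrete buffer, playing the role of a specific Bytes class."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def __len__(self) -> int:
        return len(self._data)

    def to_bytes(self) -> bytes:
        return self._data


class ConcreteGetResult:
    """Satisfies GetResultLike structurally while narrowing the return type."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def bytes(self) -> MyBytes:
        return MyBytes(self._data)


def payload_size(result: GetResultLike) -> int:
    # Works with any implementation of the protocol, including the narrowed one.
    return len(result.bytes())
```

Code written against the protocol accepts `ConcreteGetResult`, while callers that hold the concrete class also see the extra `to_bytes` method on the narrowed return type.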
4 changes: 3 additions & 1 deletion obstore/python/obstore/_head.pyi
@@ -1,4 +1,6 @@
from ._list import ObjectMeta
# TODO: fix import
from obspec._meta import ObjectMeta

from .store import ObjectStore

def head(store: ObjectStore, path: str) -> ObjectMeta:
Expand Down
37 changes: 12 additions & 25 deletions obstore/python/obstore/_list.pyi
Expand Up @@ -5,40 +5,23 @@
# ruff: noqa: A001
# Variable `list` is shadowing a Python builtin (Ruff)

from datetime import datetime
from typing import Generic, List, Literal, Self, TypedDict, TypeVar, overload

from arro3.core import RecordBatch, Table
from obspec._meta import ObjectMeta

from .store import ObjectStore

class ObjectMeta(TypedDict):
"""The metadata that describes an object."""

path: str
"""The full path to the object"""

last_modified: datetime
"""The last modified time"""

size: int
"""The size in bytes of the object"""

e_tag: str | None
"""The unique identifier for the object

<https://datatracker.ietf.org/doc/html/rfc9110#name-etag>
"""

version: str | None
"""A version indicator for this object"""

ListChunkType = TypeVar("ListChunkType", List[ObjectMeta], RecordBatch, Table) # noqa: PYI001
"""The data structure used for holding list results.

By default, listing APIs return a `list` of [`ObjectMeta`][obstore.ObjectMeta]. However
By default, listing APIs return a `list` of [`ObjectMeta`][obspec.ObjectMeta]. However
for improved performance when listing large buckets, you can pass `return_arrow=True`.
Then an Arrow `RecordBatch` will be returned instead.

This implements [`obspec.ListChunkType_co`][], but is redefined here to specialize the
exact instance of the Arrow return type, given that in the obstore implementation, an
[`arro3.core.RecordBatch`][] or [`arro3.core.Table`][] will always be returned.
"""
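
The way a `return_arrow` flag can select the chunk type is easier to see in a small sketch that uses `typing.overload` with `Literal`. This is an assumption-laden illustration, not obstore's code: `list_chunk` is a hypothetical function, and `ColumnBatch` is a hypothetical columnar container standing in for an Arrow `RecordBatch`.

```python
from typing import Literal, overload


class ColumnBatch:
    """Hypothetical columnar container standing in for an Arrow RecordBatch."""

    def __init__(self, columns: dict[str, list]) -> None:
        self.columns = columns

    @property
    def num_rows(self) -> int:
        # All columns have the same length; an empty batch has zero rows.
        return len(next(iter(self.columns.values()), []))


@overload
def list_chunk(rows: list[dict], *, return_arrow: Literal[True]) -> ColumnBatch: ...
@overload
def list_chunk(
    rows: list[dict], *, return_arrow: Literal[False] = False
) -> list[dict]: ...
def list_chunk(rows, *, return_arrow=False):
    """Return rows as-is, or pivot them into one list per field."""
    if not return_arrow:
        return rows
    names = rows[0].keys() if rows else []
    return ColumnBatch({name: [row[name] for row in rows] for name in names})
```

With the overloads in place, a type checker infers `list[dict]` for `list_chunk(rows)` and `ColumnBatch` for `list_chunk(rows, return_arrow=True)`, which is the same role the constrained `ListChunkType` plays for the listing APIs.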

class ListResult(TypedDict, Generic[ListChunkType]):
Expand All @@ -47,6 +30,8 @@ class ListResult(TypedDict, Generic[ListChunkType]):
Includes objects, prefixes (directories) and a token for the next set of results.
Individual result sets may be limited to 1,000 objects based on the underlying
object storage's limitations.

This implements [`obspec.ListResult`][].
"""

common_prefixes: List[str]
Expand All @@ -56,8 +41,10 @@ class ListResult(TypedDict, Generic[ListChunkType]):
"""Object metadata for the listing"""

class ListStream(Generic[ListChunkType]):
"""A stream of [ObjectMeta][obstore.ObjectMeta] that can be polled in a sync or
"""A stream of [ObjectMeta][obspec.ObjectMeta] that can be polled in a sync or
async fashion.

This implements [`obspec.ListStream`][].
""" # noqa: D205

def __aiter__(self) -> Self:
Expand Down Expand Up @@ -170,7 +157,7 @@ def list(
```

!!! note
The order of returned [`ObjectMeta`][obstore.ObjectMeta] is not
The order of returned [`ObjectMeta`][obspec.ObjectMeta] is not
guaranteed

!!! note
Expand Down
9 changes: 0 additions & 9 deletions obstore/python/obstore/_obstore.pyi
@@ -1,5 +1,3 @@
from ._attributes import Attribute as Attribute
from ._attributes import Attributes as Attributes
from ._buffered import AsyncReadableFile as AsyncReadableFile
from ._buffered import AsyncWritableFile as AsyncWritableFile
from ._buffered import ReadableFile as ReadableFile
Expand All @@ -14,10 +12,7 @@ from ._copy import copy_async as copy_async
from ._delete import delete as delete
from ._delete import delete_async as delete_async
from ._get import BytesStream as BytesStream
from ._get import GetOptions as GetOptions
from ._get import GetResult as GetResult
from ._get import OffsetRange as OffsetRange
from ._get import SuffixRange as SuffixRange
from ._get import get as get
from ._get import get_async as get_async
from ._get import get_range as get_range
Expand All @@ -29,13 +24,9 @@ from ._head import head_async as head_async
from ._list import ListChunkType as ListChunkType
from ._list import ListResult as ListResult
from ._list import ListStream as ListStream
from ._list import ObjectMeta as ObjectMeta
from ._list import list as list # noqa: A004
from ._list import list_with_delimiter as list_with_delimiter
from ._list import list_with_delimiter_async as list_with_delimiter_async
from ._put import PutMode as PutMode
from ._put import PutResult as PutResult
from ._put import UpdateVersion as UpdateVersion
from ._put import put as put
from ._put import put_async as put_async
from ._rename import rename as rename
Expand Down