Skip to content

Cognition integration provider #302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 71 commits into
base: dev
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
71 commits
Select commit Hold shift + click to select a range
24e6e5c
perf(alembic): adds integration providers
andhreljaKern May 15, 2025
f7efa5d
chore: update submodules
andhreljaKern May 15, 2025
e425c84
perf: rename integration providers
andhreljaKern May 16, 2025
2d54ed6
chore: update submodules
andhreljaKern May 16, 2025
b246d1d
chore: update submodules
andhreljaKern May 16, 2025
3d36c65
perf: update integration last_extraction
andhreljaKern May 16, 2025
4b0ab79
chore: update submodules
andhreljaKern May 16, 2025
5620cd0
perf(alembic): use list integration access types
andhreljaKern May 16, 2025
fcce2c7
chore: update submodules
andhreljaKern May 16, 2025
bb5746c
perf: add integration providers
andhreljaKern May 16, 2025
ae22650
chore: update submodules
andhreljaKern May 16, 2025
d78ab1a
perf(alembic): adds integration providers
andhreljaKern May 16, 2025
a776438
chore: update submodules
andhreljaKern May 16, 2025
b3eae84
perf: update integration providers
andhreljaKern May 16, 2025
1b0bd86
perf: task manipulation
andhreljaKern May 16, 2025
2728ba4
chore: update submodules
andhreljaKern May 16, 2025
c7a2b7c
chore: update submodules
andhreljaKern May 20, 2025
5171baa
perf(alembic): integration provider
andhreljaKern May 20, 2025
e781ccb
perf: add org_id to integration provider
andhreljaKern May 26, 2025
3893760
chore: update submodules
andhreljaKern May 26, 2025
0f8a518
chore: update submodules
andhreljaKern May 26, 2025
02e8655
chore: update submodules
andhreljaKern May 26, 2025
5632966
perf(alembic): recreate integration providers
andhreljaKern May 26, 2025
d5c72e9
chore: update submodules
andhreljaKern May 26, 2025
5bfeb6b
perf(alembic): add started_at
andhreljaKern May 26, 2025
d1494c6
chore: update submodules
andhreljaKern May 27, 2025
550332f
perf(alembic): add integration records
andhreljaKern May 27, 2025
308df12
chore: update submodules
andhreljaKern May 27, 2025
b4e8cd2
chore: update submodules
andhreljaKern May 27, 2025
ecd7a8b
perf: update integration providers
andhreljaKern May 27, 2025
6406398
chore: update submodules
andhreljaKern May 27, 2025
c904383
perf(alembic): sharepoint integration
andhreljaKern May 27, 2025
9e5a363
chore: update submodules
andhreljaKern May 29, 2025
775d788
perf(alembic): add integrations
andhreljaKern May 29, 2025
c440d53
chore: update submodules
andhreljaKern May 29, 2025
d849ef5
perf(alembic): integration tables
andhreljaKern May 29, 2025
fd0267d
chore: update submodules
andhreljaKern Jun 3, 2025
cbb034b
chore: merge submodules
andhreljaKern Jun 3, 2025
a14138a
perf(alembic): db update
andhreljaKern Jun 3, 2025
99fd438
chore: update submodules
andhreljaKern Jun 3, 2025
00c3667
perf(alembic): db update
andhreljaKern Jun 3, 2025
c4f3466
chore: update submodules
andhreljaKern Jun 3, 2025
c8f5118
perf(alembic): integration sync updates
andhreljaKern Jun 3, 2025
d998b48
chore: update submodules
andhreljaKern Jun 13, 2025
3f0aa18
perf(alembic): added column
andhreljaKern Jun 13, 2025
841a968
chore: update submodules
andhreljaKern Jun 16, 2025
3fa5341
perf(alembic): db upgrade
andhreljaKern Jun 16, 2025
5540a1a
Merge branch 'dev' into cognition-integration-provider
andhreljaKern Jun 18, 2025
74c25b3
chore: update submodules
andhreljaKern Jun 24, 2025
43c8a45
perf: add project deletion internal endpoint
andhreljaKern Jun 25, 2025
d33d1ad
chore: update submodules
andhreljaKern Jun 25, 2025
70e3f2f
perf: update internal projects delete endpoint
andhreljaKern Jun 25, 2025
53930c4
perf(alembic): db update
andhreljaKern Jun 25, 2025
a48e0c3
chore: update submodules
andhreljaKern Jun 25, 2025
5f158f8
perf(alembic): db updates
andhreljaKern Jun 25, 2025
fec2418
chore: update submodules
andhreljaKern Jun 25, 2025
1a6991e
Merge branch 'dev' into cognition-integration-provider
andhreljaKern Jun 25, 2025
bd53656
chore: update submodules
andhreljaKern Jun 26, 2025
c6726e7
perf(alembic): db update
andhreljaKern Jun 26, 2025
53393a1
chore: update submodules
andhreljaKern Jun 26, 2025
2bd60c8
Merge branch 'dev' into cognition-integration-provider
andhreljaKern Jun 26, 2025
10eb229
Adding groups for access management (#304)
lumburovskalina Jun 26, 2025
3bdb623
chore: update submodules
andhreljaKern Jun 26, 2025
950036a
perf(alembic): db alignment
andhreljaKern Jun 26, 2025
3e4f540
chore: update submodules
andhreljaKern Jun 26, 2025
95cb724
perf: rename task
andhreljaKern Jun 26, 2025
70a850d
perf: add fail-safe while wait
andhreljaKern Jun 27, 2025
6f74eae
chore: update submodules
andhreljaKern Jun 27, 2025
ea94747
perf: pr review comments
andhreljaKern Jun 27, 2025
59a7e14
chore: update submodules
andhreljaKern Jun 27, 2025
93b6a66
perf: make REFINERY_ATTRIBUTE_ACCESS constants
andhreljaKern Jun 27, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
502 changes: 502 additions & 0 deletions alembic/versions/36f087da55b1_adds_integration_tables.py

Large diffs are not rendered by default.

107 changes: 107 additions & 0 deletions alembic/versions/6868ac66ea92_adds_cognition_group_management.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,107 @@
"""adds cognition group management

Revision ID: 6868ac66ea92
Revises: 36f087da55b1
Create Date: 2025-06-26 12:58:16.408919

"""

from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# revision identifiers, used by Alembic.
revision = "6868ac66ea92"
down_revision = "36f087da55b1"
branch_labels = None
depends_on = None


def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.create_table(
"group",
sa.Column("id", postgresql.UUID(as_uuid=True), nullable=False),
sa.Column("organization_id", postgresql.UUID(as_uuid=True), nullable=True),
sa.Column("name", sa.String(), nullable=True),
sa.Column("description", sa.String(), nullable=True),
sa.Column("created_at", sa.DateTime(), nullable=True),
sa.Column("created_by", postgresql.UUID(as_uuid=True), nullable=True),
sa.Column("meta_data", sa.JSON(), nullable=True),
sa.ForeignKeyConstraint(["created_by"], ["user.id"], ondelete="SET NULL"),
sa.ForeignKeyConstraint(
["organization_id"], ["organization.id"], ondelete="CASCADE"
),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("name"),
schema="cognition",
)
op.create_index(
op.f("ix_cognition_group_created_by"),
"group",
["created_by"],
unique=False,
schema="cognition",
)
op.create_index(
op.f("ix_cognition_group_organization_id"),
"group",
["organization_id"],
unique=False,
schema="cognition",
)
op.create_table(
"group_member",
sa.Column("id", postgresql.UUID(as_uuid=True), nullable=False),
sa.Column("group_id", postgresql.UUID(as_uuid=True), nullable=True),
sa.Column("user_id", postgresql.UUID(as_uuid=True), nullable=True),
sa.Column("created_at", sa.DateTime(), nullable=True),
sa.ForeignKeyConstraint(
["group_id"], ["cognition.group.id"], ondelete="CASCADE"
),
sa.ForeignKeyConstraint(["user_id"], ["user.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
schema="cognition",
)
op.create_index(
op.f("ix_cognition_group_member_group_id"),
"group_member",
["group_id"],
unique=False,
schema="cognition",
)
op.create_index(
op.f("ix_cognition_group_member_user_id"),
"group_member",
["user_id"],
unique=False,
schema="cognition",
)
op.add_column("user", sa.Column("oidc_identifier", sa.String(), nullable=True))
# ### end Alembic commands ###


def downgrade():
# ### commands auto generated by Alembic - please adjust! ###
op.drop_column("user", "oidc_identifier")
op.drop_index(
op.f("ix_cognition_group_member_user_id"),
table_name="group_member",
schema="cognition",
)
op.drop_index(
op.f("ix_cognition_group_member_group_id"),
table_name="group_member",
schema="cognition",
)
op.drop_table("group_member", schema="cognition")
op.drop_index(
op.f("ix_cognition_group_organization_id"),
table_name="group",
schema="cognition",
)
op.drop_index(
op.f("ix_cognition_group_created_by"), table_name="group", schema="cognition"
)
op.drop_table("group", schema="cognition")
# ### end Alembic commands ###
4 changes: 2 additions & 2 deletions api/transfer.py
Original file line number Diff line number Diff line change
Expand Up @@ -196,7 +196,7 @@ def __calculate_missing_attributes(project_id: str, user_id: str) -> None:
i += 1
if i >= 60:
i = 0
daemon.reset_session_token_in_thread()
general.remove_and_refresh_session(request_new=True)
if tokenization.is_doc_bin_creation_running_or_queued(project_id):
time.sleep(2)
continue
Expand All @@ -211,7 +211,7 @@ def __calculate_missing_attributes(project_id: str, user_id: str) -> None:
break
if i >= 60:
i = 0
daemon.reset_session_token_in_thread()
general.remove_and_refresh_session(request_new=True)

current_att_id = attribute_ids[0]
current_att = attribute.get(project_id, current_att_id)
Expand Down
5 changes: 5 additions & 0 deletions app.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
from fast_api.routes.organization import router as org_router
from fast_api.routes.project import router as project_router
from fast_api.routes.project_setting import router as project_setting_router
from fast_api.routes.project_internal import router as project_internal_router
from fast_api.routes.misc import router as misc_router
from fast_api.routes.comment import router as comment_router
from fast_api.routes.attribute import router as attribute_router
Expand Down Expand Up @@ -43,6 +44,7 @@
PREFIX_ORGANIZATION,
PREFIX_PROJECT,
PREFIX_PROJECT_SETTING,
PREFIX_PROJECT_INTERNAL,
PREFIX_MISC,
PREFIX_COMMENT,
PREFIX_ATTRIBUTE,
Expand Down Expand Up @@ -121,6 +123,9 @@
fastapi_app_internal.include_router(
record_internal_router, prefix=PREFIX_RECORD_INTERNAL, tags=["record-internal"]
)
fastapi_app_internal.include_router(
project_internal_router, prefix=PREFIX_PROJECT_INTERNAL, tags=["project-internal"]
)


routes = [
Expand Down
12 changes: 12 additions & 0 deletions controller/auth/kratos.py
Original file line number Diff line number Diff line change
Expand Up @@ -263,3 +263,15 @@ def check_user_exists(email: str) -> bool:
if i["traits"]["email"].lower() == email.lower():
return True
return False


def get_user_from_search(email: str) -> bool:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the existing function get_userid_from_mail be used to avoid code duplication?

request = requests.get(
f"{KRATOS_ADMIN_URL}/identities?preview_credentials_identifier_similar={quote(email)}"
)
if request.ok:
identities = request.json()
for i in identities:
if i["traits"]["email"].lower() == email.lower():
return i
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typing, same applies to return None below

return None
11 changes: 11 additions & 0 deletions controller/monitor/manager.py
Original file line number Diff line number Diff line change
Expand Up @@ -115,3 +115,14 @@ def cancel_parse_cognition_file_task(
transformation_key,
with_commit=True,
)


def cancel_integration_task(
task_info: Dict[str, Any],
) -> None:

integration_id = task_info.get("integrationId")

task_monitor.set_integration_task_to_failed(
integration_id, error_message="Cancelled by task manager"
)
79 changes: 79 additions & 0 deletions controller/project/manager.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,16 +15,24 @@
data_slice,
information_source,
general,
attribute,
embedding,
)
from submodules.model import daemon
from fast_api.types import HuddleData, ProjectSize
from controller.task_master import manager as task_master_manager
from submodules.model.enums import TaskType, RecordTokenizationScope
from submodules.model.business_objects import util as db_util
from submodules.model.integration_objects.helper import (
REFINERY_ATTRIBUTE_ACCESS_GROUPS,
REFINERY_ATTRIBUTE_ACCESS_USERS,
)
from submodules.s3 import controller as s3
from service.search import search
from controller.auth import kratos
from submodules.model.util import sql_alchemy_to_dict
from controller.embedding import connector


ALL_PROJECTS_WHITELIST = {
"id",
Expand Down Expand Up @@ -53,6 +61,77 @@ def get_all_projects(organization_id: str) -> List[Project]:
return project.get_all(organization_id)


def get_all_projects_with_access_management(organization_id: str) -> List[Project]:
return project.get_all_with_access_management(organization_id)


def activate_access_management(project_id):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typing

relative_position = attribute.get_relative_position(project_id)
if relative_position is None:
relative_position = 1
else:
relative_position += 1
filter_attributes = [
REFINERY_ATTRIBUTE_ACCESS_GROUPS,
REFINERY_ATTRIBUTE_ACCESS_USERS,
]
attribute.create(
project_id=project_id,
relative_position=relative_position,
name=filter_attributes[0],
data_type=enums.DataTypes.PERMISSION.value,
user_created=False,
visibility=enums.AttributeVisibility.HIDE.value,
with_commit=True,
state=enums.AttributeState.AUTOMATICALLY_CREATED.value,
)
attribute.create(
project_id=project_id,
relative_position=relative_position + 1,
name=filter_attributes[1],
data_type=enums.DataTypes.PERMISSION.value,
user_created=False,
visibility=enums.AttributeVisibility.HIDE.value,
with_commit=True,
state=enums.AttributeState.AUTOMATICALLY_CREATED.value,
)
all_embeddings = embedding.get_all_embeddings_by_project_id(project_id)
for embedding_item in all_embeddings:
prev_filter_attributes = embedding_item.filter_attributes or []
new_filter_attributes = list(set(prev_filter_attributes + filter_attributes))
embedding_item.filter_attributes = new_filter_attributes
general.commit()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit after the for loop? the if statement below also issues another commit

if connector.update_attribute_payloads_for_neural_search(
project_id, str(embedding_item.id)
):
embedding.update_embedding_filter_attributes(
project_id,
str(embedding_item.id),
new_filter_attributes,
with_commit=True,
)


def deactivate_access_management(project_id: str) -> None:
record.delete_access_management_attributes(project_id)
access_groups_attribute = attribute.get_by_name(
project_id, REFINERY_ATTRIBUTE_ACCESS_GROUPS
)
access_users_attribute = attribute.get_by_name(
project_id, REFINERY_ATTRIBUTE_ACCESS_USERS
)
if access_groups_attribute:
attribute.delete(project_id, access_groups_attribute.id, with_commit=True)
if access_users_attribute:
attribute.delete(project_id, access_users_attribute.id, with_commit=True)


def is_access_management_activated(project_id: str) -> bool:
access_groups = attribute.get_by_name(project_id, REFINERY_ATTRIBUTE_ACCESS_GROUPS)
access_users = attribute.get_by_name(project_id, REFINERY_ATTRIBUTE_ACCESS_USERS)
return access_groups is not None and access_users is not None


def get_all_projects_by_user(organization_id) -> List[Project]:
projects = project.get_all_by_user_organization_id(organization_id)
project_dicts = sql_alchemy_to_dict(
Expand Down
Loading