Syncing ..cntd #311


Closed

Conversation


@ShootingKing-AM (Contributor) commented Oct 28, 2022

Objective:

  • To sync life-logged data of interest to other devices using a 3rd-party syncing service

HLD:

  • Overall Logic
    1. Periodically push all data of interest after a certain timestamp to the 3rd-party sync dir (e.g. Dropbox's dir)
      1. Make a folder named with the current device identifier in the sync dir
      2. Copy data to that folder
      3. Maintain this sync time in the db
      4. Recheck the synced time + schedule the future sync, and goto (1)
    2. Periodically pull other devices' data
      1. Find other devices' folders in the sync dir and match against last-sync timing data (based on the current device's db last-sync and the other device's file modification time)
      2. For each new device's data:
        1. Copy the other device's data to a staging dir
        2. Merge the other device's data from staging into the current device's database
      3. goto (1)
  • Specifications
    • Data
      • Data is in the form of an sqlite database, reconstructed from the current device's database with only the data of interest
      • Data of interest - default (AFK watcher and Window watcher) or user-specified buckets
      • Device identifier - presently the hostname
      • "Current device" means the device on which aw-sync is running
      • Development and initial production - keep a minimum of 2 historical copies of files when writing to them
    • Config
      • Sync dir is user-specified or defaults to {LocalDataDir}/activitywatch/aw-server-rust/aw-sync/sync/{host_ids}
      • Staging dir defaults to {LocalDataDir}/activitywatch/aw-server-rust/aw-sync/staging/{host_ids}
      • User specification comes from the config file (in the future, possibly from the webui)
    • Timing
      • Time period for both up- and downstream: 10 min default, or user-specified
      • Generate db data after a certain timestamp: last sync time minus 1 hr. There can be new devices on the network which need data about other devices from the start, so all upstream pushes are complete data pushes, i.e. from the point ActivityWatch started logging data
      • Maintain sync times in the db table sync (id AUTO_INCREMENT PRIMARY KEY, host {0 for self}, last_sync timestamp)
    • Arch
      • Should scale to a future P2P, decentralized, no-3rd-party model, hence the sync framework should be isolated as far as practical from the file copy-paste actions
      • aw-sync: a Rust binary or a module? A very thin CLI wrapped over a module, with instant sync (even for testing) plus continuous periodic syncs - but if a CLI, how will we get the Datastore which is opened and running?
      • Multi-threaded - one thread each for upstream and downstream, plus the main thread to keep track of them
      • Future - on exit, EXIT GRACEFULLY - close files, stop data insertion, etc. - required to ensure no data loss
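The `sync` table and the "last sync minus 1 hr" push window above can be sketched as follows. This is a minimal sketch: the DDL column names and the `push_window_start` helper are my assumptions, not the final schema.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Hypothetical DDL for the proposed `sync` table; names are illustrative.
const CREATE_SYNC_TABLE: &str = "CREATE TABLE IF NOT EXISTS sync (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    host      TEXT NOT NULL,    -- 0 / empty for the current device
    last_sync INTEGER NOT NULL  -- unix timestamp of the last completed sync
);";

/// Start of the event window for the next upstream push: one hour
/// before the recorded last sync, so events that arrived late near
/// the boundary are not missed.
fn push_window_start(last_sync: SystemTime) -> SystemTime {
    last_sync - Duration::from_secs(60 * 60)
}

fn main() {
    let last_sync = UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    let start = push_window_start(last_sync);
    let secs = start.duration_since(UNIX_EPOCH).unwrap().as_secs();
    // window starts exactly 3600 s before the recorded last sync
    assert_eq!(secs, 1_700_000_000 - 3600);
    println!("{}", CREATE_SYNC_TABLE);
}
```

The one-hour overlap trades a small amount of redundant copying for robustness; merged events would need to be deduplicated on insert.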

LLD pseudocode - oversimplified:

main.rs:

    import SyncConfig

    main()
        read_args() // sync_mode[push|pull|sync], sync_adaptor=[file] sync_adaptor_option={sync_dir, sync_staging_dir}, optional:{server_ip, port} - clap
        validate_args()
        set SyncConfig including adaptor_options
        
        match args:
            push:
                sync_push() // one time push
            pull:
                sync_pull() // one time pull
            sync:
                sync() // continuous
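The `main()` dispatch above can be sketched in Rust. The real implementation would use clap for `read_args()`; this self-contained version matches on argv directly, and the mode strings and `SyncMode` shape are assumptions taken from the HLD.

```rust
// Sketch of the aw-sync CLI entry point: parse a sync mode and dispatch.
#[derive(Debug, PartialEq)]
enum SyncMode {
    Push, // one-time push
    Pull, // one-time pull
    Both, // continuous periodic sync
}

fn parse_mode(arg: &str) -> Result<SyncMode, String> {
    match arg {
        "push" => Ok(SyncMode::Push),
        "pull" => Ok(SyncMode::Pull),
        "sync" => Ok(SyncMode::Both),
        other => Err(format!("unknown sync_mode: {other}")),
    }
}

fn main() {
    // Default to continuous sync when no mode is given.
    let mode = std::env::args().nth(1).unwrap_or_else(|| "sync".to_string());
    match parse_mode(&mode) {
        Ok(m) => println!("running in {m:?} mode"),
        Err(e) => eprintln!("{e}"),
    }
}
```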
sync.rs:

    SyncAdaptor trait to have
        push()
        pull()

    struct SyncConfig
        adaptor: file //can be p2p
        adaptor_options: Vec<> // sync_dir,  staging_dir
        mode: SyncMode
        Server_IP:
        Port:
        buckets:

    struct SyncMode
        PUSH, PULL, BOTH

    fn sync_push()
        // based on setting -sync_adaptor file
        match adaptor
            file_push()

    fn sync_pull()
        // based on setting -sync_adaptor file
        match adaptor
            file_pull()

    fn sync()
        set_up_sync() // Select adaptor and push adaptor-specific config data (here sync_dir, staging_dir)
        start_push_Thread()
        start_pull_Thread()

    fn poll_for_sync(SyncMode)
        presentTime > nextSyncTime

    fn force_push()
        // return true to force_push before nextSyncTime from thread

    fn force_pull()
        // return true to force_pull before nextSyncTime from thread

    push_Thread_main()
        poll_for_sync || force_push()
            sync_push()
        else
            sleep(poll_time=1000s)
    
    pull_Thread_main()
        poll_for_sync || force_pull()
            sync_pull()
        else
            sleep(poll_time=1000s)
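The `SyncAdaptor` trait and one iteration of the push worker loop above might look as follows. This is a sketch under assumptions: `NoopAdaptor` is a hypothetical stand-in for the file adaptor, error handling is minimal, and `force_push()`/`force_pull()` signalling is omitted.

```rust
use std::thread;
use std::time::{Duration, Instant};

// The adaptor interface the push/pull threads work against;
// `file` is the first implementation, p2p could come later.
trait SyncAdaptor {
    fn push(&self) -> std::io::Result<()>;
    fn pull(&self) -> std::io::Result<()>;
}

/// poll_for_sync from the pseudocode: true once the next
/// scheduled sync time has been reached.
fn poll_for_sync(now: Instant, next_sync: Instant) -> bool {
    now >= next_sync
}

struct NoopAdaptor; // hypothetical stand-in for the file adaptor

impl SyncAdaptor for NoopAdaptor {
    fn push(&self) -> std::io::Result<()> { println!("push"); Ok(()) }
    fn pull(&self) -> std::io::Result<()> { println!("pull"); Ok(()) }
}

// One iteration of push_Thread_main; the real loop would run
// forever and also honor force_push().
fn push_iteration<A: SyncAdaptor>(adaptor: &A, next_sync: Instant, poll_time: Duration) {
    if poll_for_sync(Instant::now(), next_sync) {
        adaptor.push().expect("push failed");
    } else {
        thread::sleep(poll_time);
    }
}

fn main() {
    let adaptor = NoopAdaptor;
    push_iteration(&adaptor, Instant::now(), Duration::from_millis(10));
}
```

Keeping the trait free of file-system details is what would later allow swapping the file adaptor for a P2P one, per the Arch requirement.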

file.rs:
    file implements SyncAdaptor
        fn file_pull()
            sys_io()
                find_other_device_data_folders()
                for each other device data()
                    if _should_pull_file(otherdevice_dataFile)
                        copy_to_staging()
                        work_on_staging()
                            aw_io()
                                odDataStore_buckets = other_device_open_data()::get_buckets() // sqlite.db DataStore:: API
                                self_DataStore = current_device_ds::open()
                                setup_bucket_self_dataStore() // if not exists create 
                                for each bucket:
                                    odDataStore_buckets.getEvents()
                                    self_DataStore.buckets.InsertEvents()

        fn file_push()
            sys_io()
                copy_current_device_db_to_staging() // to avoid db manipulation while syncing - but if it's open presently, how to handle?
                // make a DataStore BUSY flag to pause DataStore manipulations and then copy the underlying sqlite file?
                aw_io()
                    self_DataStore_buckets = current_device_ds::open() // which is copied to staging folder
                    staging_db::datastore = setup_current_device_db() // open a new sqlite staging db for the current device, to create a db with "filtered" buckets
                    for each bucket of interest from SyncConfig::buckets in self_DataStore_buckets
                        staging_db::CreateBucket
                        self_DataStore_buckets::getEvents()
                        staging_db::DS.insertEvents()

        fn _should_pull_file(otherDevice_datafile)
            if file_modified_date(otherDevice_datafile) > last_sync_date()
                true
        
		fn _copy_to_staging(...)
			_backup_staging() // 5 revisions
			sys_io().clean_files()
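The `_should_pull_file` mtime check can be sketched like this, assuming the last-sync timestamp comes from the current device's db as described in the HLD. The function name and signature are illustrative.

```rust
use std::path::Path;
use std::time::SystemTime;

/// Pull a remote device's database file only if its modification
/// time is newer than the sync time recorded in our own db.
fn should_pull_file(path: &Path, last_sync: SystemTime) -> std::io::Result<bool> {
    let modified = std::fs::metadata(path)?.modified()?;
    Ok(modified > last_sync)
}

fn main() -> std::io::Result<()> {
    // Exercise the check against a freshly written temp file.
    let file = std::env::temp_dir().join("aw-sync-mtime-demo.db");
    std::fs::write(&file, b"demo")?;
    // Just written, so it must be newer than the unix epoch.
    assert!(should_pull_file(&file, SystemTime::UNIX_EPOCH)?);
    // And not newer than a point an hour in the future.
    let future = SystemTime::now() + std::time::Duration::from_secs(3600);
    assert!(!should_pull_file(&file, future)?);
    std::fs::remove_file(&file)?;
    Ok(())
}
```

Note that file mtimes are only a heuristic: clock skew between devices or sync services rewriting timestamps could cause missed or redundant pulls, which the staging-dir merge step would need to tolerate.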

Todo

  • main.rs:

    • main()
      • read_args() // sync_mode[push|pull|sync], sync_adaptor=[file] sync_adaptor_option={sync_dir, sync_staging_dir}, optional:{server_ip, port} - clap
      • validate_args()
  • sync.rs:

    • SyncAdaptor trait to have - push(), pull()
    • struct SyncConfig - adaptor: file // can be p2p; adaptor_options: Vec<> // sync_dir, staging_dir; mode: SyncMode; Server_IP; Port; buckets
    • struct SyncMode - PUSH, PULL, BOTH
    • fn sync_push()
    • fn sync_pull()
    • fn sync()
      • set_up_sync() // Select adaptor and push adaptor-specific config data (here sync_dir, staging_dir)
      • start_push_Thread()
      • start_pull_Thread()
    • fn poll_for_sync(SyncMode)
    • fn force_push()
    • fn force_pull()
    • fn push_Thread_main()
    • fn pull_Thread_main()
  • file.rs: file implements SyncAdaptor

    • fn file_pull()
      • find_other_device_data_folders()
      • should_pull_file(otherdevice_dataFile)
        • copy_to_staging()
        • work_on_staging()
          • aw_io()
            • setup_bucket_self_dataStore() // if not exists create
            • for each bucket:
              • odDataStore_buckets.getEvents()
              • self_DataStore.buckets.InsertEvents()
    • fn file_push()
      • copy_current_device_db_to_staging() // to avoid db manipulation while syncing - but if it's open presently, how to handle?
      • aw_io()
        • self_DataStore_buckets = current_device_ds::open() // which is copied to staging folder
        • staging_db::datastore = setup_current_device_db() // open a new sqlite staging db for the current device, to create a db with "filtered" buckets
        • for each bucket of interest from SyncConfig::buckets in self_DataStore_buckets
          • staging_db::CreateBucket
          • self_DataStore_buckets::getEvents()
          • staging_db::DS.insertEvents()
    • fn should_pull_file(otherDevice_datafile)
    • fn copy_to_staging(...)
      • backup_staging() // 5 revisions
      • sys_io().clean_files()

@ShootingKing-AM ShootingKing-AM marked this pull request as draft October 28, 2022 18:51