Notes from the 9/22/2022 meeting of TFF collaborators

  • [Ajay Kannan, Michael Reneer] Managing versioning/dependencies
    • Proposal from LinkedIn
    • [Michael] Two concerns
      • Versions of TF and Python that TFF depends on
      • Python - can we support older versions, can we support newer ones
      • We support 3.9 for now, soon 3.10
    • [A] Could negotiate specific versions - let’s unpack
    • [M] Why 3.9
      • Mostly for pytype
      • May be other features - could be flag guarded
    • (lots of back and forth on nuts and bolts - didn’t take notes)
    • Resolution/action items:
      • TFF to downgrade the dependency versions in the OSS release to what works
      • Michael to coordinate downgrade with Ajay, Ajay to test what works
      • Revised version of the proposal to follow
      • Will need a system for periodically updating the “downgraded version” to make sure it keeps advancing
      • Ajay, Michael to propose an upgrade schedule for that
      • Revision draft async, to present next time
  • [Tong Zhou et al.] Discussion of recent experiments/findings on scalability
    • TFF Questions
    • [Tong] Question on expected length for TFF rounds
      • The extra time doesn’t seem to be spent in forward or backprop
      • Suspecting aggregation
      • Unsurprising that TFF and Keras performance match for a single round
        • Reading data not a factor
        • All time is TF time
      • Data ingestion a likely suspect, needs to be measured better
        • Overlapping data ingestion and processing is one of the factors
        • In general, missed opportunities for optimization when training rounds are O(seconds)
      • There’s support in TFF for prefetching/preprocessing data K rounds ahead of training
        • APIs used in the tutorial are synchronous, but async and pipelining are natively available under the hood in the TFF runtime
        • Relevant code in OSS, just not very well exposed for use
        • Looks like it could solve the problem - to try out
      • Action item for TFF team: follow up with links on how to set up ingestion and preprocessing K rounds ahead
      • Tong to follow up with new experiments
  • Async instance of next meeting possibly in 1 week
  • To follow up interactively on Discord.
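
Illustration for the scalability item above (preparing data K rounds ahead of training): a minimal sketch in plain Python of the general pattern, where a background thread loads upcoming rounds while the current round trains. This is not the TFF runtime's API; the relevant TFF code was not linked in the meeting. The names `prefetch_rounds`, `load_round`, `num_rounds`, and `k` are hypothetical and chosen only for this example.

```python
import queue
import threading
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def prefetch_rounds(
    load_round: Callable[[int], T], num_rounds: int, k: int
) -> Iterator[T]:
    """Yield load_round(0), load_round(1), ... while loading ahead in the background.

    At most k finished rounds wait in the buffer; one more may be in flight.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    buffer: queue.Queue = queue.Queue(maxsize=k)

    def producer() -> None:
        try:
            for round_num in range(num_rounds):
                # put() blocks once k rounds are buffered, which bounds memory use.
                buffer.put(("ok", load_round(round_num)))
            buffer.put(("done", None))
        except Exception as err:
            # Forward loader failures to the consumer instead of hanging it.
            buffer.put(("error", err))

    # Daemon thread: an abandoned iterator will not keep the process alive.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = buffer.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


if __name__ == "__main__":
    # Hypothetical loader standing in for per-round data ingestion/preprocessing.
    def load_round(round_num: int) -> list:
        return [round_num] * 3

    for data in prefetch_rounds(load_round, num_rounds=4, k=2):
        print(data)  # a training step for this round would go here
```

With this shape, ingestion for later rounds overlaps with training on the current one, which matters most when rounds take only seconds. Whether it closes the observed gap still needs to be measured in the follow-up experiments.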