java - How to properly initialize task state at Apache Flink? -
i working on financial anti-fraud system, based on apache flink. need calculate many different aggregates, based on financial transactions. use kafka stream data source. example, in average transaction amount calculation use mapstate storing total transactions count , total amount per card. aggregated data stored @ apache accumulo. know persistent states in flink, not need. there way load initial data flink before computation begins? can done using 2 connected streams data accumulo latest computed aggregates , transactions stream? transactions stream infinite, aggregates stream not. way should dig to? appreciated.
i've thought asyncio, states can't used async functions. idea was: check aggregates @ in-memory state. if there no data card here - code makes call storage service, fetch data it, performs computations , updates in-memory state, so, next transaction card don't need processed call external data service. think big bottleneck.
you try way:
task::setinitialstate task::invoke create basic utils (config, etc) , load chain of operators setup-operators task-specific-init initialize-operator-states open-operators run close-operators dispose-operators task-specific-cleanup common-cleanup
Comments
Post a Comment