
  1. This is a cumulative story summarizing the activities required or recommended for TF Operator productization before it or its dependent stories can be released.

    The following issues have to be resolved in the scope of this story:

    1. There is no TF Operator component in this Jira, and it is an open question which Jira is the correct one for this component - TF Operator (Juniper or TF)

    2. TF Operator needs to be added to CI. It's not at the moment.

    3. TF Operator needs to land in github.com/tungstenfabric/tf-operator repo (supposedly)

    4. TF Operator needs its implementation to get productized

      1. the current Provisioner and StatusMonitor containers are not present in the build system. Update: it has been decided to roll them back to the existing production-ready NodeManager and Provisioner, which are already in the build

      2. the current repository code needs to be partitioned because right now it is a mix of open-source (supposedly) and commercial parts (OVA, Contrail Command, AppFormix).

      3. the Operator doesn't react to parameter changes - configmaps are renewed, but services are not notified and containers are not restarted. SIGHUP or a container restart is to be introduced everywhere.

      4. some analytics services aren't present - snmp, alarm, topology. They are to be added as separate pods in the same way as is done for other deployers.

      5. the stock containers used (cassandra, zookeeper, etc…) might have security vulnerabilities - other deployers switched to our own containers because the stock ones didn't address vulnerabilities. The stock ones are to be rechecked with aquascan and, in case of security issues, our own containers are to replace them.

      6. webui doesn't work - to be fixed

      7. contrail-status doesn't work - to be fixed

      8. Vrouter clean-up.sh doesn’t work as it relies on a bash signal handler - to be fixed

      9. Multi NIC setup is not supported - to be fixed

      10. Redis SSL is not supported - to be fixed

      11. Cassandra nodetool drain is to be done periodically - restoring NodeManager would fix that

      12. 1 GB and 2 MB hugepages are not supported - to be fixed

      13. The RabbitMQ config has fewer of the options needed for production than other orchestrators provide (RABBITMQ_MIRRORED_QUEUE_MODE; the password is base64 but must be sha256 for security; inet_dist_listen options; tcp_listen_options; loglevel control)

      14. device manager doesn't work - to be fixed

      15. Analytics DB should be separate from Config DB in production deployments

      16. Services listen on 0.0.0.0 - to be fixed for security reasons

      17. Container parameter customization via the <SERVICE>__<SECTION>__<VARNAME>=val approach (via env variables) is not supported, which restricts the ability to apply workarounds on customer setups - to be fixed

    5. TF Operator doesn't follow the declared best practices for OpenShift and Operators. Right now it's a single big Operator, while the best practice is to create a separate Operator for each distinct component/role (this is optional but strongly recommended before releasing it for the first time)

    6. The TF Operator OpenShift repo should be separate from the TF Operator repo (optional but highly recommended). It's not the case now.

    These items are to become subtasks and be dealt with individually. Most probably there are more to come.
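
    To make item 4.3 more concrete, here is a minimal sketch of how a changed configuration could be detected so that only the affected services get a SIGHUP or a restart. It is an illustration only, not the Operator's code: the function names, the service names and the dict layout are assumptions, and actually delivering the signal or restarting the container is left out.

```python
import hashlib


def config_digest(data):
    """Return a stable SHA-256 digest of a configmap-like dict (file name -> content)."""
    h = hashlib.sha256()
    for key in sorted(data):  # sort keys so the digest does not depend on dict order
        h.update(key.encode("utf-8"))
        h.update(b"\0")
        h.update(data[key].encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


def services_to_reload(previous_digests, current_configs):
    """List services whose current config differs from the last applied digest.

    previous_digests: dict of service name -> digest recorded at the last apply
    current_configs:  dict of service name -> configmap-like dict
    Both layouts are hypothetical and chosen only for this sketch.
    """
    changed = []
    for service in sorted(current_configs):
        if previous_digests.get(service) != config_digest(current_configs[service]):
            changed.append(service)
    return changed
```

    The caller would store the new digests after each successful reload, so that a service is notified once per actual change rather than on every reconcile.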
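
    For item 4.17, the <SERVICE>__<SECTION>__<VARNAME>=val convention can be sketched as follows. This is a hedged illustration of the naming scheme only: the helper names and the sample service and variable names are hypothetical, and it assumes the target config files are INI-style.

```python
import configparser
import io


def parse_overrides(env):
    """Collect SERVICE__SECTION__VARNAME=val entries from an environment-like dict.

    Returns a nested dict: service -> section -> variable -> value.
    Names that do not have three non-empty parts are ignored.
    """
    overrides = {}
    for name, value in env.items():
        parts = name.split("__", 2)
        if len(parts) != 3 or not all(parts):
            continue
        service, section, var = parts
        overrides.setdefault(service, {}).setdefault(section, {})[var] = value
    return overrides


def apply_overrides(ini_text, service_overrides):
    """Apply section -> variable -> value overrides to INI text and return the result."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep variable names case-sensitive
    parser.read_string(ini_text)
    for section, values in service_overrides.items():
        # DEFAULT is built into configparser and must not be added explicitly
        if section != parser.default_section and not parser.has_section(section):
            parser.add_section(section)
        for var, value in values.items():
            parser.set(section, var, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

    In a container entrypoint this would typically be called as `parse_overrides(dict(os.environ))`, with each service applying only its own entry to its config file.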