TF

Created by Szymon Golebiewski on Feb 03, 2021

operator-project

1 Comment

Alexey Morlang
This is a cumulative story summarizing the activities required or recommended for TF Operator productization before it, or the stories that depend on it, can be released.
The following issues have to be resolved within the scope of this story:
1. There is no TF Operator component in this Jira, and it is an open question which Jira is the correct one for this component (Juniper or TF).
2. TF Operator needs to be added to CI. It is not there at the moment.
3. TF Operator needs to land in the github.com/tungstenfabric/tf-operator repo (supposedly).
4. The TF Operator implementation needs to be productized:
  the current Provisioner and StatusMonitor containers are not present in the build system. Update: it’s decided to roll them back to the existing production-ready NodeManager and Provisioner, which are already in the build
  the current repository code needs to be partitioned because right now it is a mix of open-source (supposedly) and commercial parts (OVA, Contrail Command, AppFormix)
  the Operator doesn't react to parameter changes - configmaps are renewed, but services are not notified and containers are not restarted. SIGHUP or container restarts are to be introduced everywhere
  some analytics services aren't present - snmp, alarm, topology. They are to be added as separate pods, in the same way as is done for other deployers
  the stock containers used (cassandra, zookeeper, etc.) might have security vulnerabilities - other deployers switched to our own containers because the stock ones didn't address vulnerabilities. The stock ones are to be rechecked with aquascan and, in case of security issues, replaced with our own containers
  webui doesn't work - to be fixed
  contrail-status doesn't work - to be fixed
  Vrouter clean-up.sh doesn’t work as it relies on a bash signal handler - to be fixed
  Multi-NIC setup is not supported - to be fixed
  redis ssl is not supported - to be fixed
  Cassandra nodetool drain is to be done periodically - restoring nodemanager would fix that
  Hugepages of 1GB and 2MB are not supported - to be fixed
  the Rabbitmq config has fewer of the options needed for production than other orchestrators provide (RABBITMQ_MIRRORED_QUEUE_MODE; the password is base64 but must be sha256 for security; inet_dist_listen options; tcp_listen_options; loglevel control)
  device manager doesn't work - to be fixed
  Analytics DB should be separate from Config DB in production deployments
  0.0.0.0 is used for listening - to be fixed for security reasons
  container parameter customization via the <SERVICE>__<SECTION>__<VARNAME>=val approach (via env variables) is not available - to be fixed, since this restricts the ability to apply workarounds (W/A) on customer setups
5. TF Operator doesn't follow the declared best practices for OpenShift and Operators. Right now it's a single big Operator, while best practice is to create a separate Operator for each distinct component/role (this is optional but strongly recommended before releasing it for the first time).
6. The TF Operator OpenShift repo should be separate from the TF Operator repo (optional but highly recommended). That's not the case now.
These items are to become subtasks and be dealt with individually. Most probably there are more to come.
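
For reference, the <SERVICE>__<SECTION>__<VARNAME>=val convention mentioned in item 4 could be handled roughly as sketched below. This is only an illustration under assumptions, not the code used by the Operator or by other deployers: the function names parse_overrides and apply_overrides, the service name VROUTER_AGENT and the option log_level are hypothetical, and the target config is assumed to be INI-style.

```python
import configparser


def parse_overrides(env):
    """Collect variables named <SERVICE>__<SECTION>__<VARNAME> from a mapping.

    Returns a nested dict: {service: {section: {varname: value}}}.
    Variables that do not split into three non-empty parts are ignored.
    """
    overrides = {}
    for key, value in env.items():
        # Split on the first two double underscores only, so the
        # variable name itself may still contain "__".
        parts = key.split("__", 2)
        if len(parts) != 3 or not all(parts):
            continue
        service, section, name = parts
        overrides.setdefault(service, {}).setdefault(section, {})[name] = value
    return overrides


def apply_overrides(config, service, overrides):
    """Write the overrides of one service into a ConfigParser object."""
    for section, options in overrides.get(service, {}).items():
        # DEFAULT always exists in ConfigParser and cannot be added.
        if section != "DEFAULT" and not config.has_section(section):
            config.add_section(section)
        for name, value in options.items():
            config.set(section, name, value)


if __name__ == "__main__":
    # Hypothetical environment; real names depend on the service.
    env = {"VROUTER_AGENT__DEFAULT__log_level": "SYS_DEBUG", "PATH": "/usr/bin"}
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep option names case-sensitive
    apply_overrides(cfg, "VROUTER_AGENT", parse_overrides(env))
    print(cfg.get("DEFAULT", "log_level"))
```

In an actual container the mapping would be os.environ, and the resulting config would be written to the service's config file before the service starts or is signalled to reload.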
- Feb 03, 2021
