Troubleshooting deployment
Connection is not safe
After many deployment attempts, it can happen that the reflector pod is not restarted automatically.
- Check if there is a secret called letsencrypt-secret-aureliusdev in our namespace:
kubectl -n <namespace> get secrets
- If it is not there, then find the reflector pod in the default namespace:
kubectl get all
- Delete the reflector pod in the default namespace (a new one will be created automatically):
kubectl -n default delete pod/<podname>
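The three steps above can be combined into one helper. This is only a sketch: the function name restart_reflector_if_secret_missing is hypothetical, and it assumes the reflector pod's name contains the word "reflector".

```shell
# Sketch: recreate the reflector pod when the TLS secret is missing.
# Assumption: the reflector pod's name contains "reflector".
SECRET_NAME="letsencrypt-secret-aureliusdev"

restart_reflector_if_secret_missing() {
    namespace="$1"
    # "kubectl get secret <name>" exits non-zero when the secret does not exist.
    if kubectl -n "$namespace" get secret "$SECRET_NAME" >/dev/null 2>&1; then
        echo "secret present"
        return 0
    fi
    # List pods in the default namespace as "pod/<name>" and keep the first reflector match.
    pod=$(kubectl -n default get pods -o name | grep reflector | head -n 1)
    if [ -z "$pod" ]; then
        echo "no reflector pod found" >&2
        return 1
    fi
    # Deleting the pod lets its controller create a fresh one.
    kubectl -n default delete "$pod"
}
```

Call it with the namespace of the deployment, for example restart_reflector_if_secret_missing <namespace>.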
Flink-jobmanager and taskmanager are not running
Flink-jobmanager is not running, and Flink-taskmanager keeps restarting, but other pods are fine.
To check if all pods are running:
kubectl -n <namespace> get all
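To spot the failing pods quickly, the listing can be filtered on the STATUS column. A minimal sketch, assuming the default kubectl column layout (NAME, READY, STATUS, RESTARTS, AGE); the function name list_unhealthy_pods is hypothetical.

```shell
# Sketch: print pods whose STATUS is neither Running nor Completed.
list_unhealthy_pods() {
    # --no-headers drops the header row; column 3 of the default output is STATUS.
    kubectl -n "$1" get pods --no-headers \
        | awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}
```

An empty result means every pod reports Running or Completed. A pod that keeps restarting typically shows a status such as CrashLoopBackOff or Error.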
Open a shell in the Atlas pod and look for the error message in the log:
kubectl -n <namespace> exec -it <pod/chart-id-atlas-0> -- bash
cd opt/apache-atlas-2.2.0/logs
cat application.log
- If you see an error like:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://10.20.129.33:9838/solr: Can not find the specified config set: vertex_index
then the vertex_index collection could not be created.
To solve it, we can create the collection manually in the Solr web client and then restart Atlas inside its pod.
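The log check can also be done without an interactive shell. A minimal sketch: the function name has_vertex_index_error is hypothetical, and the relative log path assumes kubectl exec starts in the same working directory as the interactive bash session above.

```shell
# Sketch: search the Atlas log for the missing config set error.
# The path joins the directory and file name used in the steps above.
ATLAS_LOG="opt/apache-atlas-2.2.0/logs/application.log"

has_vertex_index_error() {
    namespace="$1"
    pod="$2"
    # grep -q prints nothing and exits 0 as soon as the message is found.
    kubectl -n "$namespace" exec "$pod" -- cat "$ATLAS_LOG" \
        | grep -q "Can not find the specified config set: vertex_index"
}
```

The function returns success only when the message is present, so it can be used directly in an if statement.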
- We forward port 9838, so we can access the Solr web client:
kubectl -n <namespace> port-forward <pod/chart-id-atlas-0> 9838:9838
Open the web client at http://localhost:9838/solr
Go to the Collections menu, and add a collection.
- Name: vertex_index
- Config set: _default
- maxShardsPerNode: -1
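As an alternative to the web form, the same collection can probably be created through the Solr Collections API. This is a sketch under assumptions, not a tested procedure for this deployment: numShards=1 is an assumed value that the steps above do not state, maxShardsPerNode is only accepted by Solr versions that still support it, and the function name solr_create_collection_url is hypothetical.

```shell
# Sketch: build a Collections API request that mirrors the form values above.
# Assumptions: the port-forward on 9838 is active; numShards=1 is a guess.
solr_create_collection_url() {
    base="${1:-http://localhost:9838/solr}"
    echo "${base}/admin/collections?action=CREATE&name=vertex_index&collection.configName=_default&numShards=1&maxShardsPerNode=-1"
}

# To send the request while the port-forward is running:
#   curl "$(solr_create_collection_url)"
```

Building the URL in a function keeps the parameters visible in one place, so they can be compared against the form fields before anything is sent.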
From another terminal, open a shell in the Atlas pod again:
kubectl -n <namespace> exec -it <pod/chart-id-atlas-0> -- bash
cd opt/apache-atlas-2.2.0/
bin/atlas_stop.py
nohup bin/atlas_start.py &
- Press Ctrl+C to return to the prompt, then check that Atlas is running in the background:
jobs
If an entity is not getting created
It could be that a Flink job has failed.
- Check whether all Flink jobs are running. If not, restart them:
kubectl -n <namespace> exec -it <pod/flink-jobmanager-pod-name> -- bash
cd py_libs/m4i-flink-tasks/scripts
/opt/flink/bin/flink run -d -py <name_of_job>.py
- Check whether the entity was created in Apache Atlas.
- Check whether the entity was created in Elastic.
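The Flink check and restart can be run from outside the pod as well. A minimal sketch: both function names are hypothetical, and the relative scripts path assumes kubectl exec starts in the same working directory as the interactive session above.

```shell
# Sketch: list running Flink jobs and resubmit one by script name.
list_running_flink_jobs() {
    # "flink list -r" shows only the jobs that are currently running.
    kubectl -n "$1" exec "$2" -- /opt/flink/bin/flink list -r
}

resubmit_flink_job() {
    # $1 = namespace, $2 = jobmanager pod, $3 = job script name without ".py"
    kubectl -n "$1" exec "$2" -- bash -c \
        "cd py_libs/m4i-flink-tasks/scripts && /opt/flink/bin/flink run -d -py $3.py"
}
```

Compare the output of list_running_flink_jobs with the jobs you expect, then call resubmit_flink_job for each one that is missing.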
P.S. Be aware of resource problems: a job can also fail when the cluster runs short of CPU or memory.