LayerClientException('Internal error while getting dataset')

My featureset has successfully been built but it is failing to train the model. My integration is BigQuery. I’ve attached my GitHub repo to reproduce the problem.

LAYER RUN FAILED after 18904ms:
Failed to train model '3dc2c4b1-7615-4f7a-80d4-ac473b36a10f': LayerClientException('Internal error while getting dataset')

GitHub repo: codebrain001/Layer_projects, anomaly_detection directory (main branch)

Hey @CodeBrain, can you please run layer logs (see "Layer SDK commands" in the Layer Documentation) on your run and check whether that surfaces any issues? If you can’t find the root cause, feel free to post the logs here so we can help.

Please find the logs below:

layer logs 1e37c914-0a89-43bb-8baf-8cc5011c5bf4
[2021-09-01 12:58:42, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Starting job.
[2021-09-01 12:58:42, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Using selector: EpollSelector
[2021-09-01 12:58:43, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Using selector: EpollSelector
[2021-09-01 12:58:43, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Successfully logged into https://beta.layer.co
[2021-09-01 12:58:43, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Using selector: EpollSelector
[2021-09-01 12:58:43, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Creating ~/source dir
[2021-09-01 12:58:43, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Place __init__.py in ~/source
[2021-09-01 12:58:43, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Download binary(8c107153-d278-4e26-bfab-6733d1ad1bda/52ec880c-8efd-4e80-8bdd-551eb91f8f09/fraud_detection_model_training.tgz) to temp directory
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Binary archive fraud_detection_model_training.tgz downloaded and extracted successfully
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Installing python dependencies
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Requirement already satisfied: scikit-learn>=0.18 in /venv/lib/python3.8/site-packages (from -r /root/source/requirements.txt (line 1)) (0.24.2)
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Requirement already satisfied: xgboost>=1.2.0 in /venv/lib/python3.8/site-packages (from -r /root/source/requirements.txt (line 2)) (1.3.3)
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Requirement already satisfied: joblib>=0.11 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (1.0.1)
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Requirement already satisfied: scipy>=0.19.1 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (1.6.0)
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Requirement already satisfied: threadpoolctl>=2.0.0 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (2.2.0)
[2021-09-01 12:58:44, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Requirement already satisfied: numpy>=1.13.3 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (1.20.2)
[2021-09-01 12:58:45, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Python dependencies installed successfully
[2021-09-01 12:58:45, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Importing user code(model.py) from /source
[2021-09-01 12:58:45, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] train_model function imported successfully
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Injecting the dependencies
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Annotations: {'train': <class 'layer.client.Train'>, 'tf': Featureset(name='transaction_features', description='', id=UUID('d287ef94-b5fd-4d71-afe9-d57e59e229f4'), datasources=[], features=[], feature_names=[], dependencies=[]), 'return': typing.Any}
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Entity dependencies: {'featuresets': {'tf': Featureset(name='transaction_features', description='', id=UUID('d287ef94-b5fd-4d71-afe9-d57e59e229f4'), datasources=[], features=[], feature_names=[], dependencies=[])}, 'models': {}, 'datasets': {}, 'context': None, 'train': 'train'}
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Injecting transaction_features featureset with individual features []
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Using selector: EpollSelector
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Injected dependencies successfully: {'tf': Featureset(name='transaction_features', description='', id=UUID('9a66372e-55a0-4155-b026-ebcb8a0f3ea1'), datasources=[], features=[], feature_names=[], dependencies=[]), 'train': <layer.train.Train object at 0x7fb427e02d60>}
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Executing the train_model
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Caught exception stacktrace:
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Failure during train_model execution LayerClientException('Internal error while getting dataset')
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] ['/venv/bin/python -X faulthandler -m pyruntime.model.train_executor' exited with 1]
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Module execution process return code ->  1
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Training job completed
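
For long logs like the one above, it can help to filter for only the failure-related lines. Below is a minimal Python sketch, assuming the `[timestamp, entity, job] message` line layout seen in this output. The helper name `failure_lines` and the marker keywords are hypothetical choices made for illustration, not part of the Layer SDK.

```python
import re

# Assumed line layout, taken from the log output in this thread:
# [timestamp, entity, job-name] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^,\]]+), (?P<entity>[^,\]]+), (?P<job>[^\]]+)\] (?P<msg>.*)$"
)

# Hypothetical keywords that mark a line as failure-related.
FAILURE_MARKERS = ("exception", "failure", "exited with")


def failure_lines(log_text):
    """Return (timestamp, message) pairs for lines that look like failures."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and any(k in match["msg"].lower() for k in FAILURE_MARKERS):
            hits.append((match["ts"], match["msg"]))
    return hits


# Three lines copied from the log in this thread.
sample = """\
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Executing the train_model
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] Failure during train_model execution LayerClientException('Internal error while getting dataset')
[2021-09-01 12:58:46, fraud_detection_model, model-training-421cb3d5-a795-4966-a7db-9852e3ec1842-l7bb6] ['/venv/bin/python -X faulthandler -m pyruntime.model.train_executor' exited with 1]
"""

for ts, msg in failure_lines(sample):
    print(ts, msg)
```

To use it on a real run, save the log output to a file and pass the file contents to `failure_lines`.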

Thank you @CodeBrain. This looks like a bug on our end. We’ll fix it as soon as possible and keep you posted.

Thanks, looking forward to it.

Good morning volkan, is there any update on this?

Hi @CodeBrain, we’ve added some more monitoring on our end to understand exactly how this failure is happening. We don’t have a clear root cause yet but we are still working on it. Thank you for your patience.

Hey @CodeBrain, would you mind renaming your featureset and trying to run the whole project again? This may be an issue related to the featureset being overridden unexpectedly, and a rename could be a good workaround if that is the case.

Hi @volkan, I just renamed my featureset and I am still getting the same error:

LAYER RUN FAILED after 23984ms:
Failed to train model 'c73b800d-c77c-4cd0-9e19-94ced98be9f9': LayerClientException('Internal error while getting dataset')

Here is the log output:

[2021-09-10 08:26:12, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Starting job.
[2021-09-10 08:26:12, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Using selector: EpollSelector
[2021-09-10 08:26:13, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Using selector: EpollSelector
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Successfully logged into https://beta.layer.co
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Using selector: EpollSelector
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Creating ~/source dir
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Place __init__.py in ~/source
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Download binary(8c107153-d278-4e26-bfab-6733d1ad1bda/c73b800d-c77c-4cd0-9e19-94ced98be9f9/fraud_detection_model_training.tgz) to temp directory
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Binary archive fraud_detection_model_training.tgz downloaded and extracted successfully
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Installing python dependencies
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Requirement already satisfied: scikit-learn>=0.18 in /venv/lib/python3.8/site-packages (from -r /root/source/requirements.txt (line 1)) (0.24.2)
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Requirement already satisfied: xgboost>=1.2.0 in /venv/lib/python3.8/site-packages (from -r /root/source/requirements.txt (line 2)) (1.3.3)
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Requirement already satisfied: scipy>=0.19.1 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (1.6.0)
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Requirement already satisfied: threadpoolctl>=2.0.0 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (2.2.0)
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Requirement already satisfied: joblib>=0.11 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (1.0.1)
[2021-09-10 08:26:14, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Requirement already satisfied: numpy>=1.13.3 in /venv/lib/python3.8/site-packages (from scikit-learn>=0.18->-r /root/source/requirements.txt (line 1)) (1.20.2)
[2021-09-10 08:26:15, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Python dependencies installed successfully
[2021-09-10 08:26:15, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Importing user code(model.py) from /source
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] train_model function imported successfully
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Injecting the dependencies
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Annotations: {'train': <class 'layer.client.Train'>, 'tf': Featureset(name='features', datasource=DatasourceRef(name='', type=<DatasourceType.STORAGE: 'storage'>, id=UUID('46b339f5-b5ca-4bd5-bd2c-97962f45cb15')), description='', id=UUID('31be7fb6-b869-4b06-b2c4-0ada06a25ca5'), features=[], feature_names=[], dependencies=[]), 'return': typing.Any}
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Entity dependencies: {'featuresets': {'tf': Featureset(name='features', datasource=DatasourceRef(name='', type=<DatasourceType.STORAGE: 'storage'>, id=UUID('46b339f5-b5ca-4bd5-bd2c-97962f45cb15')), description='', id=UUID('31be7fb6-b869-4b06-b2c4-0ada06a25ca5'), features=[], feature_names=[], dependencies=[])}, 'models': {}, 'datasets': {}, 'context': None, 'train': 'train'}
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Injecting features featureset with individual features []
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Using selector: EpollSelector
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Injected dependencies successfully: {'tf': Featureset(name='features', datasource=DatasourceRef(name='', type=<DatasourceType.STORAGE: 'storage'>, id=UUID('46b339f5-b5ca-4bd5-bd2c-97962f45cb15')), description='', id=UUID('f0364204-b797-4819-8e5d-94ce7718293f'), features=[], feature_names=[], dependencies=[]), 'train': <layer.train.Train object at 0x7f9b73e49f70>}
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Executing the train_model
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Caught exception stacktrace:
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Failure during train_model execution LayerClientException('Internal error while getting dataset')
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] ['/venv/bin/python -X faulthandler -m pyruntime.model.train_executor' exited with 1]
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Module execution process return code ->  1
[2021-09-10 08:26:16, fraud_detection_model, model-training-c68a769b-37fb-41b5-80af-bf9668f78bf0-k7slq] Training job completed

@volkan still keen on hearing from you, thanks

Hi @CodeBrain, we are still looking into this issue this week. Thanks for sharing more details!

Hi @CodeBrain,

We have identified the root cause of this issue. Existing featuresets that are bound to a specific integration (Snowflake, BigQuery) and are later rewritten using a different integration fail on lookup.

While we work on fixing this (which sits deep in our stack), there’s a workaround: you can rename the featureset to a new name that doesn’t have an existing instance.

I see that @volkan previously suggested this, but the new name you chose was features, which already existed as well.
Could you try renaming your featureset to something like fraud_detection_features and running the project again?
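
As a rough illustration of the workaround, here is a minimal Python sketch for picking a featureset name that does not collide with any existing one. It assumes you have collected the existing featureset names yourself (for example, by hand from your account). The helper `pick_unused_name` is hypothetical and does not call any Layer API.

```python
def pick_unused_name(base, existing):
    """Return `base` if it is unused, otherwise `base` plus the first free numeric suffix."""
    taken = set(existing)
    if base not in taken:
        return base
    suffix = 2
    while f"{base}_{suffix}" in taken:
        suffix += 1
    return f"{base}_{suffix}"


# Names already used in this thread; both have existing instances.
existing_names = ["transaction_features", "features"]

print(pick_unused_name("fraud_detection_features", existing_names))
```

The point is only that the new name must not match any featureset that already has an instance, which is why features did not work as a replacement.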

Let me know if this helps.

-Gerard.

Hi @CodeBrain,

We just wanted to check if you had a chance to try the workaround we suggested previously?

We hope it helps,

Thank you,
Adrien

Hi Adrien, currently on it. I will give you feedback soon. Thanks for checking on me about the problem.