Topic: Google Professional Data Engineer topic 1 question 158

You need to deploy additional dependencies to all nodes of a Cloud Dataproc cluster at startup using an existing initialization action. Company security policies require that Cloud Dataproc nodes do not have access to the Internet so public initialization actions cannot fetch resources. What should you do?

A.
Deploy the Cloud SQL Proxy on the Cloud Dataproc master
B.
Use an SSH tunnel to give the Cloud Dataproc cluster access to the Internet
C.
Copy all dependencies to a Cloud Storage bucket within your VPC security perimeter
D.
Use Resource Manager to add the service account used by the Cloud Dataproc cluster to the Network User role

Re: Google Professional Data Engineer topic 1 question 158

Correct: C

If you create a Dataproc cluster with internal IP addresses only, attempts to access the Internet in an initialization action will fail unless you have configured routes to direct the traffic through a NAT or a VPN gateway. Without Internet access, you can enable Private Google Access and place job dependencies in Cloud Storage; cluster nodes can then download the dependencies from Cloud Storage over their internal IPs.
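For anyone who wants to see what that looks like in practice, here is a rough sketch of staging the dependencies in a Cloud Storage bucket with the google-cloud-storage Python client. The bucket name and file paths are made-up placeholders, not anything from the question.

    # Stage dependency artifacts in a Cloud Storage bucket so that internal-IP-only
    # Dataproc nodes can fetch them via Private Google Access.
    from google.cloud import storage

    BUCKET_NAME = "my-project-dataproc-deps"   # hypothetical bucket inside the VPC-SC perimeter
    LOCAL_FILES = ["init-action.py", "deps/python-packages.zip", "deps/job-libs.jar"]

    client = storage.Client()
    bucket = client.bucket(BUCKET_NAME)

    for path in LOCAL_FILES:
        blob = bucket.blob(path)
        blob.upload_from_filename(path)        # copies the local file to gs://<bucket>/<path>
        print(f"Uploaded gs://{BUCKET_NAME}/{path}")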

Re: Google Professional Data Engineer topic 1 question 158

Thank you for the detailed explanation. C is right.

Re: Google Professional Data Engineer topic 1 question 158

Should be C:

https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/init-actions

Re: Google Professional Data Engineer topic 1 question 158

C looks good

Re: Google Professional Data Engineer topic 1 question 158

Security Compliance: This option aligns with your company's security policies, which prohibit public Internet access from Cloud Dataproc nodes. Placing the dependencies in a Cloud Storage bucket within your VPC security perimeter ensures that the data remains within your private network.

VPC Security: By placing the dependencies within your VPC security perimeter, you maintain control over network access and can restrict access to the necessary nodes only.

Dataproc Initialization Action: You can use a custom initialization action or script to fetch and install the dependencies from the secure Cloud Storage bucket to the Dataproc cluster nodes during startup.

By copying the dependencies to a secure Cloud Storage bucket and using an initialization action to install them on the Dataproc nodes, you can meet your security requirements while providing the necessary dependencies to your cluster.
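As a rough illustration (not an official script), the initialization action itself could be a small Python file staged in the same bucket, say gs://my-project-dataproc-deps/init-action.py; it runs on every node at startup and installs the pre-staged packages without touching the public Internet. All paths and package names below are hypothetical.

    #!/usr/bin/env python3
    # Sketch of an initialization action: copy dependencies from the in-perimeter
    # bucket (reachable via Private Google Access) and install them locally.
    import subprocess

    BUCKET = "gs://my-project-dataproc-deps"     # hypothetical bucket
    STAGING_DIR = "/tmp/dataproc-deps"

    # Pull the pre-staged artifacts down to the node.
    subprocess.run(["mkdir", "-p", STAGING_DIR], check=True)
    subprocess.run(["gsutil", "-m", "cp", "-r", f"{BUCKET}/deps/*", STAGING_DIR], check=True)

    # Install Python packages only from the local copies, never from PyPI.
    subprocess.run(
        ["python3", "-m", "pip", "install", "--no-index",
         f"--find-links={STAGING_DIR}", "my-internal-package"],   # hypothetical package name
        check=True,
    )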

Re: Google Professional Data Engineer topic 1 question 158

C is correct

Re: Google Professional Data Engineer topic 1 question 158

C seems good

Re: Google Professional Data Engineer topic 1 question 158

Answer C.
This is easier to follow with practical experience. When you create a cluster you usually have dependencies: Python packages stored in a .zip file, a JAR file the cluster application needs (for example, Java libraries required by a Spark session), and some YAML config files.
You can save these dependencies in a Cloud Storage bucket and use them to configure the cluster from the SDK or the API, without going through the UI.
The cluster then reaches these files over the VPC instead of the public Internet.
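To make the SDK/API part concrete, here is a sketch with the google-cloud-dataproc Python client that creates an internal-IP-only cluster and points the initialization action at the dependencies bucket. Project, region, subnet, and bucket names are placeholders for illustration.

    # Create an internal-IP-only Dataproc cluster whose init action pulls
    # dependencies from a Cloud Storage bucket instead of the public Internet.
    from google.cloud import dataproc_v1

    PROJECT = "my-project"
    REGION = "us-central1"

    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
    )

    cluster = {
        "project_id": PROJECT,
        "cluster_name": "secure-deps-cluster",
        "config": {
            "gce_cluster_config": {
                "internal_ip_only": True,  # no external IPs, per the security policy
                "subnetwork_uri": f"projects/{PROJECT}/regions/{REGION}/subnetworks/private-subnet",
            },
            "initialization_actions": [
                # Runs on every node right after it is set up.
                {"executable_file": "gs://my-project-dataproc-deps/init-action.py"}
            ],
        },
    }

    operation = client.create_cluster(
        request={"project_id": PROJECT, "region": REGION, "cluster": cluster}
    )
    operation.result()  # block until the cluster is ready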

Re: Google Professional Data Engineer topic 1 question 158

C is the answer.

https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/network#and_vpc-sc_networks
With VPC Service Controls, administrators can define a security perimeter around resources of Google-managed services to control communication to and between those services.

Re: Google Professional Data Engineer topic 1 question 158

Without access to the internet, you can enable Private Google Access and place job dependencies in Cloud Storage; cluster nodes can download the dependencies from Cloud Storage from internal IPs.

Re: Google Professional Data Engineer topic 1 question 158

https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/network#create_a_cloud_dataproc_cluster_with_internal_ip_address_only

Re: Google Professional Data Engineer topic 1 question 158

When creating a Dataproc cluster, you can specify initialization actions in executables or scripts that Dataproc will run on all nodes in your cluster immediately after the cluster is set up. Initialization actions often set up job dependencies, such as installing Python packages, so that jobs can be submitted to the cluster without having to install dependencies when the jobs are run.
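And since the initialization action has already installed the dependencies on every node, a job can be submitted afterwards without bundling them. A minimal sketch with the same Python client library; the cluster, bucket, and script names are again just placeholders.

    # Submit a PySpark job to the cluster; the dependencies were already installed
    # at startup by the initialization action, so nothing is fetched at job time.
    from google.cloud import dataproc_v1

    PROJECT = "my-project"
    REGION = "us-central1"

    job_client = dataproc_v1.JobControllerClient(
        client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
    )

    job = {
        "placement": {"cluster_name": "secure-deps-cluster"},
        "pyspark_job": {
            # The driver script lives in the same in-perimeter bucket as the dependencies.
            "main_python_file_uri": "gs://my-project-dataproc-deps/jobs/etl_job.py",
        },
    }

    operation = job_client.submit_job_as_operation(
        request={"project_id": PROJECT, "region": REGION, "job": job}
    )
    print(operation.result().driver_output_resource_uri)  # where the driver logs end up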

Re: Google Professional Data Engineer topic 1 question 158

Correct: C

Re: Google Professional Data Engineer topic 1 question 158

C it is!

Re: Google Professional Data Engineer topic 1 question 158

Should be C

Re: Google Professional Data Engineer topic 1 question 158

Should be C

Re: Google Professional Data Engineer topic 1 question 158

I think the correct answer might be C instead, due to https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/network#create_a_cloud_dataproc_cluster_with_internal_ip_address_only