Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. The version supported by Cloudera Manager is Sqoop 2.
Configure Sqoop
Install Sqoop
Sqoop requires a Sqoop 2 Server. We often colocate the Sqoop 2 Server with Oozie or, if necessary, place it on the node running the HDFS NameNode. Try to keep the Sqoop service off nodes running YARN NodeManagers and HBase RegionServers, because those roles are memory-intensive and leave little headroom for the Sqoop 2 Server.
- From Cloudera Manager, click Add a New Service.
- Select the Sqoop 2 service and add the service to a node, preferably a node running the HDFS NameNode or Oozie. Keep the Sqoop service off nodes running YARN NodeManagers and HBase RegionServers.
Configure Sqoop
Troubleshooting
Sqoop Server Startup Failure: Upgrade required but not allowed
Problem: After an upgrade from CDH 5.0.2 to CDH 5.0.3, Sqoop failed to start with the following error: Server startup failure, Connector registration failed, Upgrade required but not allowed – Connector: generic-jdbc-connector.
Resolution: Add the following property to the Sqoop 2 Server Advanced Configuration Snippet (Safety Valve) for sqoop.properties, under Cloudera Manager, Sqoop Service, Configuration, Sqoop 2 Server Default Group, Advanced:
org.apache.sqoop.connector.autoupgrade=true
After the upgrade has completed successfully, the property can be removed.
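Because the property is meant to be temporary, it can help to check whether it is still present in the safety valve text once the upgrade is done. The following is a minimal sketch, assuming the snippet is plain key=value text; the helper names parse_properties and autoupgrade_enabled are hypothetical and are not part of Sqoop or Cloudera Manager.

```python
# Hypothetical helper: checks whether the auto-upgrade property from the
# resolution above is still set in a block of sqoop.properties text.
AUTOUPGRADE_KEY = "org.apache.sqoop.connector.autoupgrade"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def autoupgrade_enabled(text):
    """Return True if the connector auto-upgrade property is set to true."""
    return parse_properties(text).get(AUTOUPGRADE_KEY, "").lower() == "true"


if __name__ == "__main__":
    snippet = "org.apache.sqoop.connector.autoupgrade=true\n"
    print(autoupgrade_enabled(snippet))
```

Paste the current safety valve contents into snippet; a True result after a successful upgrade is a reminder that the property can now be removed.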
Log File: /var/log/sqoop2/sqoop-cmf-sqoop-SQOOP_SERVER-servername01.log.out
Server startup failure
org.apache.sqoop.common.SqoopException: CONN_0007:Connector registration failed
at org.apache.sqoop.connector.ConnectorManager.registerConnectors(ConnectorManager.java:236)
at org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:197)
at org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:145)
…
Caused by: org.apache.sqoop.common.SqoopException: JDBCREPO_0026:Upgrade required but not allowed – Connector: generic-jdbc-connector
at org.apache.sqoop.repository.JdbcRepository$3.doIt(JdbcRepository.java:190)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:90)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:61)
…
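To confirm that a startup failure is this particular problem, the log can be scanned for the SqoopException error codes shown in the trace above. This is a rough sketch only: the code format (uppercase prefix, underscore, four digits) is inferred from the two codes in this excerpt, and the function names sqoop_error_codes and needs_autoupgrade are hypothetical.

```python
import re

# Pattern inferred from the log excerpt above (e.g. CONN_0007, JDBCREPO_0026).
# This is an assumption about the format, not a documented guarantee.
CODE_PATTERN = re.compile(r"SqoopException:\s*([A-Z_]+_\d{4}):\s*(.*)")


def sqoop_error_codes(log_text):
    """Return (code, message) pairs for every SqoopException found in the text."""
    return [(m.group(1), m.group(2).strip()) for m in CODE_PATTERN.finditer(log_text)]


def needs_autoupgrade(log_text):
    """True if the log contains the 'Upgrade required but not allowed' code."""
    return any(code == "JDBCREPO_0026" for code, _ in sqoop_error_codes(log_text))


if __name__ == "__main__":
    sample = (
        "org.apache.sqoop.common.SqoopException: CONN_0007:Connector registration failed\n"
        "Caused by: org.apache.sqoop.common.SqoopException: "
        "JDBCREPO_0026:Upgrade required but not allowed – Connector: generic-jdbc-connector\n"
    )
    for code, message in sqoop_error_codes(sample):
        print(code, "->", message)
    print("auto-upgrade fix applies:", needs_autoupgrade(sample))
```

In practice the text would come from reading the log file named above rather than from an inline sample.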
Sqoop does not start on the Hadoop cluster after a Sqoop service restart
Resolution: Recreate the Sqoop database, then start the Sqoop 2 service.
Log File:
Can't fetch repository structure version.
org.apache.commons.dbcp.SQLNestedException: Borrow prepareStatement from pool failed
at org.apache.commons.dbcp.PoolingConnection.prepareStatement(PoolingConnection.java:113)
at org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:281)
at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313)
…
Caused by: java.sql.SQLSyntaxErrorException: Schema 'SQOOP' does not exist
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
…
Caused by: java.sql.SQLException: Schema 'SQOOP' does not exist
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
…
Caused by: ERROR 42Y07: Schema 'SQOOP' does not exist
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getSchemaDescriptor(Unknown Source)
at org.apache.derby.impl.sql.compile.QueryTreeNode.getSchemaDescriptor(Unknown Source)