
Commit 8911cad

author
Scot Marvin
committed
Merge branch '4.0.2'
2 parents dde1e12 + bfca10a commit 8911cad

12 files changed

+616
-152
lines changed
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
3+
<concept id="ops_testplan">
4+
<title>Testing Your Deployment</title>
5+
<shortdesc>This topic details what you should test when you want to make sure your deployment is
6+
working. The following suggested test plan contains tasks that ensure DNS, imaging, and storage
7+
are working.</shortdesc>
8+
<conbody>
9+
<section>
10+
<title>DNS</title>
11+
<ul>
12+
<li>Verify that instances can ping their:
13+
<ul>
14+
<li>Private DNS name</li>
15+
<li>Public DNS name</li>
16+
</ul>
17+
</li>
18+
<li>Verify that instances are pingable on their public DNS names from:
19+
<ul>
20+
<li>Outside the cloud</li>
21+
<li>An instance inside the cloud</li>
22+
</ul>
23+
</li>
24+
</ul>
25+
</section>
26+
27+
<section>
28+
<title>Imaging</title>
29+
<ul>
30+
<li>Verify that an EBS-backed image boots successfully</li>
31+
<li>Verify that you can create an image from a running EBS-backed instance</li>
32+
<li>Verify that you can install a new Ubuntu image</li>
33+
<li>Verify that you can deregister an image</li>
34+
<li>Verify that you can import an instance</li>
35+
<li>Verify that you can import a volume</li>
36+
</ul>
37+
</section>
38+
39+
<section>
40+
<title>Walrus</title>
41+
<ul>
42+
<li>Verify that you can make a basic s3cmd request</li>
43+
<li>Verify that you can successfully perform a multi-part upload (use a file of 1 GB or larger)</li>
44+
</ul>
45+
</section>
46+
47+
</conbody>
48+
</concept>
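
The DNS items in the test plan above can be partly automated. The following is a minimal Python sketch, not part of the original topic: the helper names (`resolves`, `ping_command`, `is_pingable`, `check_dns`) are hypothetical, and it assumes a Linux-style `ping` that accepts `-c` and exits with status 0 when it receives a reply.

```python
import socket
import subprocess


def resolves(name: str) -> bool:
    """Return True if the DNS name resolves to at least one address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def ping_command(host: str, count: int = 3) -> list[str]:
    """Build the ping invocation; -c sets the number of packets to send."""
    return ["ping", "-c", str(count), host]


def is_pingable(host: str, count: int = 3) -> bool:
    """Return True if ping exits with status 0 (assumed to mean a reply arrived)."""
    result = subprocess.run(ping_command(host, count), capture_output=True)
    return result.returncode == 0


def check_dns(names: list[str]) -> dict[str, dict[str, bool]]:
    """Report resolution and reachability for each DNS name."""
    report = {}
    for name in names:
        resolved = resolves(name)
        # Only attempt the ping when the name resolves at all.
        report[name] = {"resolves": resolved, "pingable": resolved and is_pingable(name)}
    return report
```

Run `check_dns` with the instance's private and public DNS names from inside an instance, and again with the public DNS name from outside the cloud, to cover both list items.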

content/en_us/admin-guide/ops_ts.dita

+13-1
@@ -5,6 +5,18 @@
55
<shortdesc>This topic details how to find information you need to troubleshoot most problems in
66
your cloud.</shortdesc>
77
<conbody>
8-
<p>Concept definition.</p>
8+
<p>Make sure you understand the mapping between the Eucalyptus components:</p>
9+
<ul>
10+
<li>Cloud Controller (CLC)</li>
11+
<li>User-facing services (UFS)</li>
12+
<li>Walrus</li>
13+
<li>Storage Controller (SC)</li>
14+
<li>Cluster Controller (CC)</li>
15+
<li>Node Controller (NC)</li>
16+
<li>VMware Broker (only if you use VMware)</li>
17+
</ul>
18+
<p>For most problems, the tracing procedure is the same: verify the bottom-most component first,
19+
and then work your way up. This way, you know that the base is solid before you examine the
20+
components that depend on it. This approach applies to virtually all Eucalyptus components and also works for proactive, targeted monitoring.</p>
921
</conbody>
1022
</concept>
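
The bottom-up tracing order described in the topic above can be sketched as follows. This is an illustration only: it assumes that the component list in the topic runs from top to bottom, it leaves out the optional VMware Broker, and the names `trace_order` and `first_failure` are hypothetical.

```python
# Components in the order the topic lists them, assumed here to run top to bottom.
COMPONENTS_TOP_DOWN = [
    "Cloud Controller (CLC)",
    "User-facing services (UFS)",
    "Walrus",
    "Storage Controller (SC)",
    "Cluster Controller (CC)",
    "Node Controller (NC)",
]


def trace_order(components=COMPONENTS_TOP_DOWN):
    """Return the components bottom-most first, the order in which to verify them."""
    return list(reversed(components))


def first_failure(health):
    """Walk bottom-up and return the first component not verified, or None.

    `health` maps a component name to True (verified) or False (failing).
    A component missing from the mapping counts as not verified.
    """
    for component in trace_order():
        if not health.get(component, False):
            return component
    return None
```

Because the walk starts at the Node Controller, a failure reported higher up is only returned once everything beneath it has been verified.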
@@ -0,0 +1,50 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
3+
<task id="ops_ts_install">
4+
<title>Problem: install-time checks</title>
5+
<shortdesc/>
6+
<taskbody>
7+
<context>
8+
<p>Eucalyptus offers installation checks for any Eucalyptus component or service (CLC, Walrus,
9+
SC, CC, NC, services, and more). When Eucalyptus encounters an error, it presents the problem to
10+
the operator. These checks cover install-time problems and provide resolutions for some of the fault conditions.</p>
11+
12+
<p>Each problematic condition contains the following information:</p>
13+
<table>
14+
<tgroup cols="2">
15+
<thead>
16+
<row>
17+
<entry>Heading</entry>
18+
<entry>Description</entry>
19+
</row>
20+
</thead>
21+
<tbody>
22+
<row>
23+
<entry>Condition</entry>
24+
<entry>The fault found by Eucalyptus</entry>
25+
</row>
26+
<row>
27+
<entry>Cause</entry>
28+
<entry>The cause of the condition</entry>
29+
</row>
30+
<row>
31+
<entry>Initiator</entry>
32+
<entry>What is at fault</entry>
33+
</row>
34+
<row>
35+
<entry>Location</entry>
36+
<entry>Where to go to fix the fault</entry>
37+
</row>
38+
<row>
39+
<entry>Resolution</entry>
40+
<entry>The steps to take to resolve the fault</entry>
41+
</row>
42+
</tbody>
43+
</tgroup>
44+
</table>
45+
<p><image alt="Sample install check image" placement="break" scale="120" width="619" height="250" href="images/install-check.png"/></p>
46+
<p>For more information about all the faults we support, go to <xref
47+
href="https://github.com/eucalyptus/eucalyptus/tree/testing/util/faults/en_US" scope="external" format="html">https://github.com/eucalyptus/eucalyptus/tree/testing/util/faults/en_US</xref>.</p>
48+
</context>
49+
</taskbody>
50+
</task>
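
The five headings in the table above map naturally onto a small record type. The sketch below is a hypothetical representation for scripts that collect fault reports. It does not reproduce the actual Eucalyptus fault file format, and the class name `FaultReport` is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FaultReport:
    condition: str   # the fault found by Eucalyptus
    cause: str       # the cause of the condition
    initiator: str   # what is at fault
    location: str    # where to go to fix the fault
    resolution: str  # the steps to take to resolve the fault

    def summary(self) -> str:
        """Render the five headings in the order the table lists them."""
        return "\n".join(
            f"{f.name.capitalize()}: {getattr(self, f.name)}" for f in fields(self)
        )
```

Keeping the fields in table order means `summary()` prints a report that reads the same way as the install check output described in the topic.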

content/en_us/admin-guide/ops_ts_instance_com.dita

+34-15
@@ -5,39 +5,58 @@
55
<shortdesc/>
66
<taskbody>
77
<context>
8-
<p>Use ping from client (not CLC). Can you ping it?:</p>
8+
<p>Use ping from a client (not the CLC). Can you ping it?</p>
99
</context>
1010
<steps-unordered>
1111
<step>
12-
<cmd>Yes: check open ports on security groups, retry connection (ssh, hhtp, etc) - did it work?</cmd>
12+
<cmd>Yes:</cmd>
13+
<info>Check the open ports on the security groups and retry the connection using SSH or HTTP. Can you
14+
connect now?</info>
1315
<substeps>
1416
<substep>
15-
<cmd>Yes.</cmd>
17+
<cmd>Yes. Your work is done.</cmd>
1618
</substep>
1719
<substep>
18-
<cmd>No: same procedure as if you can't ping it upfront.</cmd>
20+
<cmd>No:</cmd>
21+
<info>Follow the same procedure as when you cannot ping the instance at all.</info>
1922
</substep>
2023
</substeps>
21-
24+
2225
</step>
23-
<step><cmd>If the instance is not there, log in as admin and run
24-
<apiname>euca_describe_instance</apiname>. Is the instance there?</cmd>
26+
<step>
27+
<cmd>No:</cmd>
28+
<info>Is your cloud running in Edge networking mode? </info>
2529
<info>
2630
<ul>
27-
<li>If the instance is there, note your A2 and do the following.
31+
<li>Yes:
32+
<p>Run <codeph>euca-describe-nodes</codeph>. Is your instance there? </p>
2833
<ul>
29-
<li>Run <apiname>euca_describe-az verbose</apiname>. </li>
30-
<li>Note the CC IP</li>
31-
<li>Go to the CC log and grep Instance ID.</li>
34+
<li>Yes:
35+
<p>Ping the instance's public IP from the NC. Can you ping it? If so, check the network between the
36+
client and the NC (a successful ping indicates that the problem is not in the Eucalyptus network).</p></li>
37+
<li>No:
38+
<p>Check <filepath>eucanetd.log</filepath> and the iptables rules. Make sure the instance has a visible
39+
public IP and that the iptables rules have the expected ports opened.</p></li>
3240
</ul>
3341
</li>
34-
<li>If the instance is not there, start over and run a new instance, recreate failure, and
35-
start these steps over.</li>
42+
<li>No, it is not in Edge networking mode: <ol>
43+
<li>Run <codeph>euca-describe-instances</codeph></li>
44+
<li>Note the AZ name.</li>
45+
<li>Run <codeph>euca-describe-availability-zones verbose</codeph>.</li>
46+
<li>Note the IP for the CC.</li>
47+
<li>Ping the instance's private IP from the CC. <p>Are there error messages?</p>
48+
<ul>
49+
<li>Yes: <p>Check the network connection between the client and the CC.</p></li>
50+
<li>No: <p>Check <filepath>eucanetd.log</filepath> and the iptables rules. Make sure the instance has a visible
51+
public IP and that the iptables rules have the expected ports opened.</p></li>
52+
</ul>
53+
</li>
54+
</ol>
55+
</li>
3656
</ul>
3757
</info>
38-
58+
3959
</step>
4060
</steps-unordered>
4161
</taskbody>
4262
</task>
43-
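
The connectivity task above is a decision tree, so it can be written down as a small function. This is a sketch of the branches as the topic states them, not an official tool: the function name `next_action` and its parameters are hypothetical, and each answer still has to be gathered by hand (for example with `ping` or `euca-describe-nodes`).

```python
def next_action(can_ping, connects_after_port_check=False,
                edge_mode=False, listed_on_node=False, cc_ping_errors=False):
    """Return the next troubleshooting step for an unreachable instance.

    Each argument is the answer to one yes/no question in the task:
    can_ping                   client can ping the instance
    connects_after_port_check  SSH/HTTP works after checking security group ports
    edge_mode                  cloud runs in Edge networking mode
    listed_on_node             euca-describe-nodes shows the instance
    cc_ping_errors             pinging the private IP from the CC reports errors
    """
    if can_ping and connects_after_port_check:
        return "done"
    # Either the ping failed, or it worked but the connection still fails;
    # the task says to follow the same procedure in both cases.
    if edge_mode:
        if listed_on_node:
            return "ping the public IP from the NC and check the network between client and NC"
        return "check eucanetd.log and the iptables rules"
    if cc_ping_errors:
        return "check the network connection between the client and the CC"
    return "check eucanetd.log and the iptables rules"
```

For example, `next_action(False, edge_mode=True, listed_on_node=False)` points to `eucanetd.log` and the iptables rules, matching the Edge-mode "No" branch.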
@@ -1,61 +1,73 @@
11
<?xml version="1.0" encoding="UTF-8"?>
22
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
33
<task id="ops_ts_instance_fails">
4-
<title>Problem: instance runs but fails</title>
5-
<shortdesc/>
6-
<taskbody>
7-
<context>
8-
<p>Run <apiname>euca_describe_nodes</apiname> to verify if instance is there. </p>
9-
</context>
10-
<steps-unordered>
11-
<step>
12-
<cmd>If the instance is there:</cmd>
13-
<substeps>
14-
<substep>
15-
<cmd>Go to NC log for that NC and grep your instance ID and see if the instance is there.</cmd>
16-
<info>
17-
<ul>
18-
<li>If the instance is there: is there error message?
19-
<ul>
20-
<li>If there is an error message, this clues you in to some helpful information</li>
21-
<li>No: go to CC log, grep instance ID</li>
22-
</ul>
23-
</li>
24-
</ul>
25-
</info>
26-
</substep>
27-
<substep>
28-
<cmd>No: go to CC log, grep Instance ID. Is it there error message?</cmd>
29-
<info>
30-
<ul>
31-
<li>Yes: good- this clues you in to some helpful information
32-
</li>
33-
<li>No: grep instance ID in cloud-output.log. Is there error message?
34-
<ul><li>Yes: good - this clues you in to some helpful information</li>
35-
<li>No: grep volume ID in SC log</li></ul></li>
36-
</ul>
37-
</info>
38-
</substep>
39-
</substeps>
40-
41-
</step>
42-
<step><cmd>If the instance is not there, log in as admin and run
43-
<apiname>euca_describe_instance</apiname>. Is the instance there?</cmd>
44-
<info>
45-
<ul>
46-
<li>If the instance is there, note your A2 and do the following.
47-
<ul>
48-
<li>Run <apiname>euca_describe-az verbose</apiname>. </li>
49-
<li>Note the CC IP</li>
50-
<li>Go to the CC log and grep Instance ID.</li>
51-
</ul>
52-
</li>
53-
<li>If the instance is not there, start over and run a new instance, recreate failure, and
54-
start these steps over.</li>
55-
</ul>
56-
</info>
57-
58-
</step>
59-
</steps-unordered>
60-
</taskbody>
4+
<title>Problem: instance runs but fails</title>
5+
<shortdesc/>
6+
<taskbody>
7+
<context>
8+
<p>Run <codeph>euca-describe-nodes</codeph> to check whether the instance is listed. Is the instance
9+
there?</p>
10+
</context>
11+
<steps-unordered>
12+
<step>
13+
<cmd>Yes:</cmd>
14+
<substeps>
15+
<substep>
16+
<cmd>Go to the <xref href="../troubleshooting-guide/ts_logs.dita">NC log</xref> for that
17+
NC and grep your instance ID. Did you find the instance?</cmd>
18+
<info>
19+
<ul>
20+
<li>Yes: <p>Is there an error message?</p>
21+
<ul>
22+
<li>Yes: <p>The error message should give you some helpful information.</p></li>
23+
<li>No: <p>Go to the <xref href="../troubleshooting-guide/ts_logs.dita">CC
24+
log</xref> and grep the instance ID.</p></li>
25+
</ul>
26+
</li>
27+
</ul>
28+
</info>
29+
</substep>
30+
<substep>
31+
<cmd>No: </cmd>
32+
<info>Go to the <xref href="../troubleshooting-guide/ts_logs.dita">CC log</xref> and
33+
grep the instance ID. Is there an error message? <ul>
34+
<li>Yes: <p>The error message should give you some helpful information.</p>
35+
</li>
36+
<li>No: <p>Grep the instance ID in <xref
37+
href="../troubleshooting-guide/ts_logs.dita">cloud-output.log</xref>. Is there an
38+
error message?</p>
39+
<ul>
40+
<li>Yes:<p>The error message should give you some helpful information.</p></li>
41+
<li>No: <p>Grep the volume ID in <xref href="../troubleshooting-guide/ts_logs.dita"
42+
>SC log</xref>.</p>
43+
</li>
44+
</ul>
45+
</li>
46+
</ul>
47+
</info>
48+
</substep>
49+
</substeps>
50+
51+
</step>
52+
<step>
53+
<cmd>No: </cmd>
54+
<info>Log in as admin and run <codeph>euca-describe-instances</codeph>. Is the instance
55+
there? <ul>
56+
<li>Yes: <ul>
57+
<li>Note your AZ.</li>
58+
<li>Run <codeph>euca-describe-availability-zones verbose</codeph>.</li>
59+
<li>Note the CC IP.</li>
60+
<li>Go to the <xref href="../troubleshooting-guide/ts_logs.dita">CC log</xref> and
61+
grep the instance ID.</li>
62+
</ul>
63+
</li>
64+
<li>No:
65+
<p>Run a new instance, recreate the failure,
66+
and start these steps over.</p></li>
67+
</ul>
68+
</info>
69+
70+
</step>
71+
</steps-unordered>
72+
</taskbody>
6173
</task>
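
The task above repeatedly greps a log for an instance or volume ID and then asks whether an error message is present. A minimal Python sketch of that step follows. The log path is left as a parameter because the topic only links to the log locations, and treating any line that contains "error" as an error message is an assumption of this sketch.

```python
from pathlib import Path


def grep_log(log_path, needle):
    """Return (line_number, line) pairs for every line containing `needle`."""
    matches = []
    with Path(log_path).open(errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if needle in line:
                matches.append((number, line.rstrip("\n")))
    return matches


def error_lines(matches):
    """Keep only the matches that mention 'error' (case-insensitive)."""
    return [(number, text) for number, text in matches if "error" in text.lower()]
```

If `grep_log` returns matches but `error_lines` returns nothing, the task says to move on to the next log in the chain (NC, then CC, then cloud-output.log, then SC).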
@@ -0,0 +1,49 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
3+
<task id="ops_ts_volume_creation">
4+
<title>Problem: snapshot creation failed</title>
5+
<shortdesc/>
6+
<taskbody>
7+
<context>
8+
<p>On the SC, use <codeph>df</codeph> or <codeph>lvdisplay</codeph> to check the disk space in
9+
<filepath>/var/lib/eucalyptus/volumes</filepath>. Is there enough space?</p>
10+
</context>
11+
<steps-unordered>
12+
<step>
13+
<cmd>Yes:</cmd>
14+
<info>Use <codeph>df</codeph> or <codeph>lvdisplay</codeph> to check the disk space in
15+
<filepath>/var/lib/eucalyptus/bukkits</filepath>. Is there enough space?</info>
16+
<substeps>
17+
<substep>
18+
<cmd>Yes.</cmd>
19+
<info><ul>
20+
<li>Use <codeph>euca-describe-services</codeph> and note the IP addresses for the OSG and
21+
SC.</li>
22+
<li>SSH to the SC and ping the OSG.
23+
<p>Are there error messages?</p>
24+
<ul>
25+
<li>Yes:
26+
<p>Check the network connection between the SC and the OSG.</p>
27+
</li>
28+
<li>No:
29+
<p>Check <xref href="../troubleshooting-guide/ts_logs.dita">the SC and the OSG logs</xref>
30+
for the snapshot ID.</p>
31+
</li>
32+
</ul>
33+
</li>
34+
</ul> </info>
35+
</substep>
36+
<substep>
37+
<cmd>No:</cmd>
38+
<info>Delete volumes or add disk space.</info>
39+
</substep>
40+
</substeps>
41+
42+
</step>
43+
<step>
44+
<cmd>No:</cmd>
45+
<info>Delete volumes or add disk space.</info>
46+
</step>
47+
</steps-unordered>
48+
</taskbody>
49+
</task>
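
The disk space checks in the task above use `df` or `lvdisplay`. As a rough alternative for scripted checks, the sketch below uses Python's `shutil.disk_usage`. The function names are hypothetical, and the topic does not say how much space counts as enough, so the threshold is supplied by the caller.

```python
import shutil


def free_bytes(path):
    """Return the free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def has_enough_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return free_bytes(path) >= required_bytes
```

On the SC this could be called with the volumes and bukkits directories named in the task, passing the size of the snapshot you expect to create as `required_bytes`.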
